The technical pitfalls of a “proprietary stack” (and how to avoid them)

Christian Umbach

The Idea in Brief:

  • In the quest to “own data,” aggregators will dedicate valuable time to reinventing the wheel of consolidation, cleansing, and mapping, and assume the ongoing internal cost of maintaining these functions. That time would be better spent on proprietary modeling, machine learning, and operational automation.
  • Tools designed for sellers will result in repetitive costs, tradeoffs in functionality and performance, and reporting not purpose-built for managing a portfolio of brands: choose tools designed for aggregators from the outset.
  • Most tools are built on the old Amazon MWS Services and forfeit the many improvements brought about by the Amazon Selling Partner API.
  • The business imperative: Focus teams on “building” upon proprietary knowledge in modeling and transforming data, rather than in consolidating, cleansing, and mapping it. “Buy” into partnerships that share the portfolio-first mindset to avoid the repetitive, non-value added efforts and costs that come with scale.


The eCommerce data ecosystem is a complex beast: the complexity of the digital world meets the complexity of the physical world.

Seeing the road ahead means going beyond the stack and understanding the complexities of the APIs in the domain (Image Credits: ThisisEngineering RAEng)


When it comes to defining a technical stack, we have seen a number of teams quickly drive decisions on the first part (The Fast Core Stack) but then get stuck on part two (The API Details). We are going to untangle those decisions a bit to better understand why: 


Part 1 - The Fast Core Stack 


This section covers the tooling aspects of gathering, storing, and visualizing data from Amazon. Specifically, you want to cover the following parts in your stack: 


Common components of a Fast Core Stack:

  • Storage: BigQuery, Redshift, traditional databases, and Snowflake
  • Core ETL Tooling: Tools like Fivetran help run pipelines; tools like Openbridge and Daton help connect to Amazon and other eCommerce pipelines
  • Integration: Firms like Celigo, Folio3 and Linnworks have traditionally focused on integrations with Amazon / other marketplaces and NetSuite. (Aggregators should watch out for pitfalls in commercial model or stack usability when it comes to scaling across regions and marketplaces.)
  • ERP: NetSuite is the name of the game so far, with SAP, MS Dynamics, and niche providers occasionally appearing. 
  • Data Visualization: Looker, Data Studio, Tableau, and others.


Here are a few Fast Core Stack pitfalls to look out for, especially when planning for growth:

  • Commercial Model: Most tools were built for individual sellers, not aggregators. Therefore the costs of licensing and brand-by-brand reconciliation and mapping become highly repetitive as portfolios scale.
  • Technical Scale: Again, with tools built for individual sellers, the necessary simplifications to elevate important actions and KPIs across a portfolio are missing, resulting in repetitive work or even key gaps in capability for an aggregator. 
  • Outdated Inputs: A number of tools still leverage the old Amazon MWS Services instead of the Selling Partner API, forfeiting the functionality, performance, and ease-of-use improvements of the newer service. It is also likely that the MWS Services will lose support in the coming years.


The biggest factor: Aggregators will encounter novel challenges in managing a portfolio at scale, challenges that many existing tools, built with individual sellers in mind, were never architected to handle.


Part 2 - The API Details


Even with a Core Stack in place, “build”-focused teams will have to address many non-value-added issues on their own. Both the required domain knowledge and the fine-grained details of the platform make working with the APIs time-consuming. With Amazon alone, they will face issues of:


Access: 

  1. The perceived “control” offered by a homegrown solution for accessing seller accounts will often create seller friction and legal risks, especially if credentials have to be shared.
  2. Depending on how access to seller accounts is built, it may also differ once a brand is acquired, requiring repetitive “re-integration.”

Mapping:

  1. There are two main Amazon APIs (Selling Partner, or SP-API, and the Advertising API). The SP-API has 18 different endpoints with 126 actions, while the Ad API has just 4 endpoints but 151 different actions. Each of these will need to be mapped to their appropriate KPIs, which will require time from subject-matter experts like M&A teams, who are already stretched for time. The process will be highly manual, non-differentiated, and different for every new data source.
  2. Additionally, there are more than 20 reports in Amazon spanning 6-50 different columns each, also requiring the same manual, non-value-added mapping.
  3. To reconcile a P&L alone, teams will map more than 170 items to their corresponding fields. Once fields are mapped, the data must be validated and, in all likelihood, troubleshot.
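To make the mapping burden concrete, here is a minimal sketch of what a report-column-to-P&L mapping and its validation step look like in practice. All column and field names below are hypothetical placeholders, not actual SP-API report headers:

```python
# Minimal sketch of a report-column -> P&L-field mapping table.
# Column and field names are illustrative, NOT real Amazon report headers.
PNL_MAPPING = {
    "ordered-product-sales": "gross_revenue",
    "fba-fees": "fulfillment_cost",
    "referral-fee": "selling_cost",
    "promo-rebates": "discounts",
}

def map_report_row(row: dict) -> dict:
    """Translate one raw report row into P&L fields, flagging unmapped columns."""
    mapped, unmapped = {}, []
    for column, value in row.items():
        field = PNL_MAPPING.get(column)
        if field is None:
            # Every unmapped column lands on a subject-matter expert's desk.
            unmapped.append(column)
        else:
            mapped[field] = float(value)
    return {"fields": mapped, "unmapped": unmapped}

row = {"ordered-product-sales": "1250.40", "fba-fees": "310.25", "storage-fee": "12.00"}
result = map_report_row(row)
# "storage-fee" surfaces as unmapped -- exactly the manual triage described above,
# repeated per report type, per data source, and often per brand.
```

Multiply a table like this across 20+ reports and two APIs, and the scale of the non-differentiated effort becomes clear.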

API Limiting and Timing:

  1. When dealing with the Amazon APIs, teams must handle token buckets, parameterization, and next-token (pagination) mechanisms to get a full picture of the data. In short, teams are at the mercy of Amazon’s rate limits and will have to grapple with working around them on an ongoing basis.
  2. When dealing with reports programmatically, retrieving a report actually means requesting that a report be generated, waiting for it to be generated, and then fetching it from an Amazon S3 bucket.
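The two mechanics above, token-bucket throttling and the request-wait-fetch report flow, can be sketched roughly as follows. This is a simplified illustration: the rate-limit numbers are invented, and `create_report`, `get_status`, and `download` stand in for real SP-API calls rather than reproducing their actual signatures:

```python
import time

class TokenBucket:
    """Client-side throttle mirroring Amazon's token-bucket rate limiting.

    The rate/burst values are chosen by the caller; real SP-API limits
    vary per operation and are published in Amazon's documentation.
    """
    def __init__(self, rate: float, burst: int):
        self.rate, self.burst = rate, burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def acquire(self) -> None:
        """Take one token, sleeping if the bucket is empty."""
        now = time.monotonic()
        # Refill tokens accrued since the last call, capped at burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < 1:
            time.sleep((1 - self.tokens) / self.rate)
            self.tokens = 1.0
        self.tokens -= 1

def fetch_report(create_report, get_status, download, bucket: TokenBucket):
    """The three-step flow: request generation, poll until done, then download."""
    bucket.acquire()
    report_id = create_report()
    while True:
        bucket.acquire()
        if get_status(report_id) == "DONE":
            break
    bucket.acquire()
    return download(report_id)
```

Every one of these calls counts against the bucket, so even a simple daily report pull has to be throttle-aware end to end; this is the kind of undifferentiated plumbing teams end up maintaining indefinitely.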

Although issues with the Fast Core Stack are costly, repetitive, and not unique, they are surmountable. It is making sense of and using the data at the API-details level that will truly set teams back. Here, Xapix has already addressed the issues with a portfolio mindset, so you can focus on actually making decisions with data and building exciting applications and machine learning around it. Even if just to automate workflows, those workflows are proprietary in a way that data retrieval never can be. Purpose-built for data and API orchestration, we’ve architected Xapix specifically for aggregator use cases, whether through dashboards, notifications, reports, database connections, or via API.

We’re here to dig in with you, to be a friend in overcoming challenges like token bucket handling, or simply to help you assess where you stand and where building vs. buying will really pay off.