No one likes making mistakes.
And yet, making mistakes and experiencing their painful consequences is the fastest way to learn.
I want to share my most painful mistakes from my time as Head of Business Analytics at Rocket Internet - then the world's largest venture builder.
When I joined in 2011, Rocket Internet's business model was to rapidly launch numerous e-commerce businesses, mainly in emerging markets such as India, South East Asia, and South America.
My job was to build data infrastructure and data teams for all of these companies.
More specifically: to replicate the data analytics setup of a company called Zalando.
Zalando had pioneered advanced marketing attribution modeling and data-driven decision-making, becoming Europe's leading fashion retailer.
Thanks to their data setup, they were the first company in Europe that could tell in real time how much customer lifetime value they earned per dollar spent on marketing, with precision down to the keyword level.
This helped them outspend and outgrow all competitors in record time. Our mission was crystal-clear: copy their winning formula across all other companies.
Here are all the mistakes I made on this mission and what I learned from them. Let's go!
Mistake 1: Starting with a huge concept and theoretical framework
We started by analyzing Zalando's data infrastructure setup. Then we wrote a 132-page document describing the tables, dashboards, and pipelines we wanted to build.
Learning 1: We wasted too much time on theoretical concepts. We should have started building and shipping earlier.
Learning 2: Building and shipping wasn't possible because none of the portfolio companies were operating yet. We were way too early with our initiative. Never launch a data infrastructure before product-market fit!
Mistake 2: Covering every use case under the sun
Our data infrastructure was designed to serve every use case in every business domain right from the start.
Here's what we covered:
Learning 3: It's much better to serve one use case perfectly than serve dozens of use cases sub-optimally.
Mistake 3: Connecting too many data sources
Every data source you add to your data infrastructure carries f*ck up potential.
This is especially true if the operational database serving your data infrastructure is still under development and changing at an incredible rate. Serving every possible use case meant we needed to integrate an insanely large number of data sources. Needless to say, this led to a nightmare of constantly breaking pipelines.
Learning 4: Now, I define use cases for a new data infrastructure in ways that minimize the number of data sources to connect - especially for more complex data sources.
Learning 5: If possible, I avoid integrating data sources that frequently change their schema and/or semantics. If such integration can't be avoided, data contracts are essential.
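To make the idea of a data contract more concrete, here is a minimal sketch in Python. The column names, the `ORDERS_CONTRACT` dictionary, and the `validate_batch` helper are all hypothetical and only illustrate the principle: the pipeline checks every incoming batch against an agreed schema and reports drift before anything is loaded.

```python
from datetime import date

# Hypothetical contract for an "orders" source: column name -> expected type.
ORDERS_CONTRACT = {
    "order_id": int,
    "customer_id": int,
    "order_date": date,
    "revenue": float,
}


def validate_batch(rows, contract):
    """Return a list of human-readable contract violations (empty if none)."""
    violations = []
    for i, row in enumerate(rows):
        # Columns the contract promises but the source did not deliver.
        for col in sorted(contract.keys() - row.keys()):
            violations.append(f"row {i}: missing column '{col}'")
        # Columns the source delivered but the contract does not know.
        for col in sorted(row.keys() - contract.keys()):
            violations.append(f"row {i}: unexpected column '{col}'")
        # Type check for every contracted column that is present.
        for col, expected in contract.items():
            if col in row and not isinstance(row[col], expected):
                violations.append(
                    f"row {i}: '{col}' should be {expected.__name__}, "
                    f"got {type(row[col]).__name__}"
                )
    return violations


good = [
    {"order_id": 1, "customer_id": 7, "order_date": date(2011, 5, 2), "revenue": 49.9}
]
# Simulated schema drift: the source team renamed "revenue" to "amount".
drifted = [
    {"order_id": 2, "customer_id": 8, "order_date": date(2011, 5, 3), "amount": 20.0}
]

print(validate_batch(good, ORDERS_CONTRACT))
print(validate_batch(drifted, ORDERS_CONTRACT))
```

In a real setup the contract would typically be agreed with the team that owns the source system, and a non-empty violation list would stop the load and alert both sides instead of silently breaking dashboards downstream.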
Mistake 4: Choosing an overly complex data integration strategy
In 2011, we built on Oracle databases hosted by our portfolio companies, so we had to be extremely careful to ingest data in a way that would scale.
We couldn't just run full loads every day. With modern cloud data warehouses, the majority of companies starting their first data initiative don't face this problem.
Learning 6: It amazes me how many companies with data volumes that could be processed on a local machine waste enormous time building pipelines using brittle insert/update mechanisms. In 99% of cases, I start by fully loading everything daily.
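As a rough illustration of what "fully loading everything daily" can look like, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for the warehouse. The `orders` table, its columns, and the `full_load` helper are assumptions made for this example, not the setup we used at the time.

```python
import sqlite3


def full_load(conn, rows):
    """Replace the whole target table with a fresh copy of the source."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute("DELETE FROM orders")
        conn.executemany(
            "INSERT INTO orders (order_id, revenue) VALUES (?, ?)", rows
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, revenue REAL)")

# Day 1: the source holds two orders.
full_load(conn, [(1, 49.9), (2, 20.0)])

# Day 2: order 1 was deleted, order 2 was corrected, order 3 is new.
# The full load picks all of this up without any change-detection logic.
full_load(conn, [(2, 25.0), (3, 10.0)])

result = conn.execute(
    "SELECT order_id, revenue FROM orders ORDER BY order_id"
).fetchall()
print(result)  # [(2, 25.0), (3, 10.0)]
```

The point of the design is what is missing: no "last modified" timestamps, no merge keys, no handling of deleted rows. All of that only becomes necessary once the data no longer fits into a daily reload.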
Mistake 5: Building without decentralization in mind
In 2011, decentralized or hybrid hub-and-spoke models weren't that common - or at least I wasn't aware of them.
We built the data team assuming a central team would handle 100% of data-related responsibilities, from building pipelines to creating dashboards to conducting ad-hoc analyses. This created massive bottlenecks and strained relationships between data teams and business stakeholders.
Learning 7: I'm now a firm believer in hybrid/hub-and-spoke models and would never build a fully centralized data team again. I usually start somewhat centralized and decentralize gradually.
Mistake 6: Building black boxes
Business decision-makers had no way to access unit-level data such as individual order, product, or customer records. We only provided aggregates, like revenue by marketing channel, date, and user segment.
Since we implemented sophisticated marketing attribution models, the revenues per marketing channel in our data infrastructure differed dramatically from web analytics tools like Google Analytics. Users had no way to investigate these differences. This led to a significant loss of trust and poor adoption.
Learning 8: Selected business decision-makers must be able to access granular data with metrics pre-calculated at the granular level. Only then can they understand what's happening and accept the data infrastructure as the single source of truth.
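Here is a small sketch of what "metrics pre-calculated at the granular level" could mean in practice. The order records, the channel names, and the assumption that a web analytics tool reports something like a last-click channel are all made up for illustration. The idea is that the attribution result is stored on each order, so every aggregate can be traced back to the rows behind it.

```python
from collections import defaultdict

# Hypothetical order-level rows: the attribution result is stored per order,
# next to the channel a web analytics tool might report (assumed: last click).
orders = [
    {"order_id": 1, "last_click_channel": "SEO", "attributed_channel": "SEM", "revenue": 50.0},
    {"order_id": 2, "last_click_channel": "SEM", "attributed_channel": "SEM", "revenue": 30.0},
    {"order_id": 3, "last_click_channel": "SEO", "attributed_channel": "SEO", "revenue": 20.0},
]


def revenue_by(rows, channel_field):
    """Aggregate revenue by the given channel column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[channel_field]] += row["revenue"]
    return dict(totals)


def reassigned_orders(rows):
    """Orders where the attribution model disagrees with the last click."""
    return [
        r["order_id"]
        for r in rows
        if r["last_click_channel"] != r["attributed_channel"]
    ]


print(revenue_by(orders, "attributed_channel"))  # {'SEM': 80.0, 'SEO': 20.0}
print(revenue_by(orders, "last_click_channel"))  # {'SEO': 70.0, 'SEM': 30.0}
print(reassigned_orders(orders))                 # [1]
```

With only the two aggregates, a user sees numbers that contradict each other. With the order-level rows, the same user can see that the whole difference comes from order 1 and judge whether the attribution makes sense.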
Mistake 7: Trying to force users away from Excel to use dashboarding tools
We told users they couldn't use Excel for ad-hoc analyses or dashboarding. All analyses and reporting had to be done using the dashboarding tool connected to our data infrastructure.
Our users strongly resisted this approach and avoided using the dashboarding software, despite our significant investment in onboarding and training.
Learning 9: For most business users, Excel/Google Sheets will likely remain the favorite analytics tool for decades to come. We need to accept this and work around the limitations of these tools (e.g., by connecting them to a well-modeled data infrastructure).
If I were to summarize my mistakes, I'd put it this way: I built the data teams and infrastructure with a solution in mind rather than the users in mind.
This approach is fundamentally flawed and will always lead to failure.
Start every data project by deeply understanding your users to dramatically increase your chances of success!
P.S.: Our Data Action Mentor masterclass - Create massive business impact with your Data Team is launching on January 16! The masterclass will dive deeper into each of these mistakes and how to avoid them. The first 100 buyers will get 50% off ($99 instead of $200). Sign up for the waitlist now!