Eliminating data fragmentation and siloes to drive AI-readiness
Q2 2022
8 min read
Companies were apprehensive about AI, leading to long enterprise sales cycles
To drive growth in a seed-stage startup, I led the product & design team to build and launch a data management platform to carve out a new market, eliminate stagnation, and build AI-readiness.
👋🏻
Design Lead
🤝
$4.2m Series A secured
🚀
Launched in 3 months
⬇️
6-figure revenue
Rejecting the solution to a material science problem
In materials science, companies often spend significant time figuring out what to adjust in a material formulation (e.g. a higher % of concentrate, or using different compounds). Adjusting formulations is the primary way of creating new or improved materials. This process is often long, wasteful, and inefficient.
Polymerize offers a proprietary ML model that uses past data to predict the outcomes of future experiments. It understands how changing one variable would affect a material’s property (e.g. tensile strength). This drastically shortens the experiment pipeline, saving companies 50-60% of their time and resources.
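For illustration only (Polymerize’s model is proprietary), the idea can be sketched as fitting past experiments and predicting an untried formulation before it is run in the lab. The variables and numbers below are entirely made up:

```python
# Toy illustration, NOT Polymerize's actual model: fit past experiments,
# then predict how a new formulation would perform.
import numpy as np

# Hypothetical past experiments: [% concentrate, additive ratio] -> tensile strength (MPa)
X = np.array([[10.0, 0.5], [20.0, 0.5], [10.0, 1.0], [30.0, 1.5]])
y = np.array([40.0, 55.0, 43.0, 74.0])

# Fit a simple linear model with an intercept term via least squares
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(concentrate_pct: float, additive_ratio: float) -> float:
    """Predict tensile strength for an untried formulation."""
    return float(coef @ np.array([concentrate_pct, additive_ratio, 1.0]))

# Explore a formulation on paper instead of in the lab
print(round(predict(25.0, 1.0), 1))
```

The real model handles far more variables and non-linear effects; the point is only that a model trained on past runs lets scientists rank candidate formulations before committing lab time.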
Yet perplexingly, 7 out of 10 companies who agreed that the model solved their problem rejected our pilot projects (free!) and sales calls.
Two-sided problem
Organisation: the flagship product didn’t work out of the box. Company data needed formatting and the ML models needed fine-tuning. This white-glove process made sales cycles long and uncertain.
Users: data isn’t typically stored neatly. It’s scattered, unformatted, and inconsistent. The long setup time meant that companies perceived a high level of effort, and they were already busy enough, perpetuating low AI/ML adoption in materials science.
Ignoring this is fine when sales pipelines are full, but it also meant that revenue growth was slow.
❓
How might we drive AI/ML adoption for companies who are less ready for it?
Defining the strategy
With only a vague direction, I started from a strategic perspective. While researching product strategy, I chanced upon Roger Martin’s “Playing to Win” framework, which emphasises answering a cascade of choices that define strategy, rather than “planning” or “envisioning”.
🎯
The organisation’s winning aspiration is to augment material science innovation and research with AI/ML.
Where to play
Every prospect that rejected the flagship product had some similarities and differences. I charted a quadrant to understand the consumer segments.
How to win
If data, no matter what it looks like, can be organised and formatted for AI/ML ingestion, any materials science lab can augment its experimentation.

What’s next?
Could this playbook unlock the 70% of prospects who rejected us? I had to find evidence.
I first considered fake-door tests, Figma prototypes, or building an MVP. However, all of them took too much effort: the flagship product was still booming, and the team had a stream of new custom features to build.
I needed something even more lightweight before asking for more investment.
Enter the minimum viable test
I discovered the Minimum Viable Test (MVT) process used by Gagan Biyani, co-founder of Udemy and Maven. It champions testing the key assumptions that must hold true, or the business will fail.
These tests can be anything; they don’t have to be an app or product! All it asks is that each assumption be tested with an atomic test. Then, rinse and repeat for each critical assumption.
Assumption 1
Can data be parsed easily in a scalable way?
If parsing is difficult and unscalable, the value proposition cannot be delivered.
I devised a Wizard of Oz test: an existing customer uploaded spreadsheets of data into a client-side upload UI on the flagship product.
The UI promised to parse their spreadsheet and format it accordingly within a few days.
The test here was on execution. In truth, it was my ML engineer and me labelling the data manually.
Due to NDA, the spreadsheet template cannot be displayed.
Result: possibly, but it needs experts
Users didn’t trust us...
We took 2 days to label their data, which was inefficient:
Not being scientists meant lots of mislabelling
Users didn’t know what we were doing with their data, as we only told them it was “algorithmic”
So, can data be parsed in a scalable way?
The experimentation process is universally consistent: we noticed that many columns of data could simply be renamed to fit the ML model’s parameters.
Continuing the MVT
The next assumption is critical: will users pay for it?
Assumption 2
Do scientists want to organise and manage their data centrally, and pay for it?
It’s two assumptions in one: data centralisation, and willingness to pay for it.
The concept: a database that centralises their data, with project management functions that mirror how labs manage their work.
I recorded a product demo and sent it to past prospects.
Result: yes!
60% of prospects booked sales calls
Many prospects responded quickly to the email demo and subsequently became leads once more.
6-figure projected subscription revenue
After 2 weeks’ worth of sales calls, we had a new bottom line.
Executive investment
With this potential revenue driver, the executive team relaxed work on the flagship product. I rallied a team of superstars to drive the initial phase.
Caveat
This assumption was tested with less rigour, as I already had strong evidence of companies struggling with data and project management.
Product-market fit
With some initial success in the sales calls, I defined the rest of the requirements so that the team could deliver within the next 4-6 weeks.
Data upload and mapper
Companies format their data against a spreadsheet template before uploading it to the platform. Behind the scenes, the algorithm can deduce, to a fair degree, what data the detected columns and rows contain, which already removes the bulk of the formatting work.
Users can then view the formulations ingested by the ML model individually. A highly customisable, verbose filter makes finding complex formulations easier.
Project Management
Business stakeholders and scientists can now track project statuses, create various views, and switch project perspectives between Kanban and Gantt chart formats.
File & Report Storage
Data is centralised and stored in one place.
Conclusion
Impact
Series A funding secured
6-figure subscription revenue within 3 months of launch
It was a smash hit: more than half of the prospects converted into the first and second cohorts of free-trial users. Today, Polymerize Connect is live, continuously iterated on, and is the only product in the company that operates on a SaaS model.
I had to make peace with many usability issues at the beginning in order to focus on product-market fit. Nonetheless, keeping features simple and focused made such issues easier to fix later.
This project was extremely satisfying: I expanded the total addressable market, created a new stackable business model, and successfully led a product, design, and tech team.
By focusing on “recycling” and using design systems, the team became proficient at prioritisation. Many features I co-designed reused the same parts as the flagship product; in fact, the entire product is just another branch of the flagship!
Still, this was extremely challenging, as the product was built alongside day-to-day responsibilities, and energy and motivation diminished over time. Besides rallying the team, I had to self-manage and keep everyone moving.
Being ready for AI
Uploading and centralising data was ultimately in service of the flagship product, which has 100-200x the revenue potential of the subscription model. With companies’ spreadsheets on the platform, I enabled a limited AI feature that let users try generating experiments with their own data.
The limited number of tries proved useful: users enquired about the MLaaS capability that Polymerize Connect doesn’t have, and were then onboarded onto the flagship product.
© 2026 Andy Chan







