While my career has spanned roles in software development, business, marketing, and product management, the constant has always been data. Now I help companies define, gather, and validate the data they need to build AI solutions.
My diverse set of experiences is now an asset to my clients, who need someone to help them navigate the software, data, and human challenges of building quality data. Whether you plan to crowdsource your data, work with data vendors, or set up your own team, I'm here to guide you through the process.
A large enterprise needed a new solution to manage the process of annotating various types of data. They used data annotations to measure the performance of algorithms and ML models over time as they made changes. Annotations were also used to train new models.
They needed to build a new system that would allow them to distribute the work to a team of in-house annotators located overseas, as well as to vendors and a crowdsourcing service.
We were closely involved in the technical architecture, defined key aspects of the product, and developed the UI design. In addition, we built most of the task interfaces the annotators used and developed the visual language shared across all of them.
The resulting system allowed the team to rapidly scale up the number and types of annotations it could handle.
A large enterprise used data gathered from customer websites to better inform its sales organization and wanted to expand the data it provided.
We developed a multi-stage data pipeline that used crowdsourcing (Amazon MTurk) to have crowd workers review aspects of each website, extract key details, and structure them appropriately for use.
Subsequent work explored various AI/ML approaches to further augment the data pipeline and reduce the reliance on crowd workers. In addition, their existing web scraping process was augmented with a crowdsourcing solution to capture content that would otherwise only be accessible by interacting with JavaScript- and CSS-driven elements on the page. This significantly increased both the coverage and the amount of data retrieved for customers.
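For readers who want a concrete picture of a multi-stage crowdsourced pipeline like the one described above, here is a minimal Python sketch. It is an illustration only, not the system we built: the stage names, the `SiteRecord` fields, the example field names, and the majority-vote aggregation rule are all assumptions made for this example, and the crowd responses are plain in-memory values rather than results fetched from Amazon MTurk.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SiteRecord:
    """Structured output for one website (fields are hypothetical)."""
    url: str
    is_relevant: bool
    details: dict


def majority(answers):
    """Return the most common answer, or None without a strict majority."""
    if not answers:
        return None
    value, count = Counter(answers).most_common(1)[0]
    return value if count > len(answers) / 2 else None


def review_stage(votes):
    """Stage 1: crowd workers vote on whether the site should be processed."""
    return majority(votes) is True


def extract_stage(responses):
    """Stage 2: merge each field's answers across workers by majority vote."""
    fields = {key for response in responses for key in response}
    return {
        field: majority([r[field] for r in responses if field in r])
        for field in sorted(fields)
    }


def run_pipeline(url, review_votes, extraction_responses):
    """Stage 3: structure the agreed answers into a single record."""
    if not review_stage(review_votes):
        return SiteRecord(url=url, is_relevant=False, details={})
    details = extract_stage(extraction_responses)
    return SiteRecord(url=url, is_relevant=True, details=details)


# Example with made-up worker responses for one site.
record = run_pipeline(
    "https://example.com",
    review_votes=[True, True, False],
    extraction_responses=[
        {"industry": "retail", "has_pricing_page": True},
        {"industry": "retail", "has_pricing_page": False},
        {"industry": "Retail", "has_pricing_page": True},
    ],
)
```

In this sketch a field with no strict majority comes back as `None`, which a real pipeline might route to additional workers or to an in-house reviewer instead of guessing.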