About Client
Our client is a marketing measurement company that provides a single source of truth for media investment decisions. Making these decisions starts with the data: raw data fetched from multiple sources through the APIs each channel exposes. This is where the Data Ingestion Framework (DIF) comes into play.
Business Need
- Extract – Pull data from different sources, each with its own schema, for different clients.
- Transform – Convert the raw data received from the APIs into meaningful, usable data.
- Load – Load the transformed data into a warehouse for analytics and downstream applications.
Challenges
- Integrating APIs of different natures – API sources vary widely: single API calls, dependent API calls, paginated API calls, rate-limited APIs, and so on.
- Scalability – An extensible, scalable architecture that supports new client instances through simple configuration changes.
- Guaranteed data delivery SLA
- Ingesting and cleaning high volumes of data – Some endpoints return amounts of data too large to hold in memory.
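To illustrate the first challenge, most of these API shapes can be covered by a single retry-aware pagination loop. This is only a sketch, not the framework's actual code; `fetch_page` is a hypothetical per-source adapter that returns the page items, the next cursor, and whether the call was rate-limited:

```python
import time
from typing import Callable, Iterator, Optional


def fetch_all_pages(
    fetch_page: Callable[[Optional[str]], dict],
    max_retries: int = 3,
    backoff_seconds: float = 1.0,
) -> Iterator[list]:
    """Walk a cursor-paginated API, retrying with backoff when rate-limited.

    `fetch_page(cursor)` is a hypothetical adapter returning a dict like
    {"items": [...], "next_cursor": str or None, "rate_limited": bool}.
    A single-call API is just the degenerate case with next_cursor=None.
    """
    cursor = None
    while True:
        retries = 0
        page = fetch_page(cursor)
        # Back off exponentially while the source reports rate limiting.
        while page.get("rate_limited") and retries < max_retries:
            time.sleep(backoff_seconds * (2 ** retries))
            retries += 1
            page = fetch_page(cursor)
        yield page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break
```

Dependent API calls fit the same pattern: the adapter for the dependent endpoint simply calls `fetch_all_pages` on its parent endpoint first.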
Our Solution
- Storing API credentials – Credentials of different types, such as access tokens, refresh tokens, and API keys, are encrypted before being stored in the database, and decrypted again when an API call is made. Encryption and decryption are handled via the AWS KMS service.
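The encrypt-on-write / decrypt-on-read flow can be sketched as below. This is an illustration, not the framework's code: `CredentialStore` and `XorKms` are hypothetical names, the dict stands in for the credentials table, and the toy `XorKms` (which is not real encryption) merely mimics the call shape of boto3's KMS `encrypt`/`decrypt`, which would be used in production:

```python
import base64


class CredentialStore:
    """Sketch: encrypt secrets before storing, decrypt only at call time."""

    def __init__(self, kms, key_id: str):
        self.kms = kms          # in production: boto3.client("kms")
        self.key_id = key_id
        self._db = {}           # stand-in for the credentials table

    def put(self, name: str, secret: str) -> None:
        resp = self.kms.encrypt(KeyId=self.key_id, Plaintext=secret.encode())
        # Store ciphertext base64-encoded, as a text column would hold it.
        self._db[name] = base64.b64encode(resp["CiphertextBlob"]).decode()

    def get(self, name: str) -> str:
        blob = base64.b64decode(self._db[name])
        resp = self.kms.decrypt(CiphertextBlob=blob)
        return resp["Plaintext"].decode()


class XorKms:
    """Toy stand-in mimicking the KMS response shape. NOT real encryption."""

    def encrypt(self, KeyId, Plaintext):
        return {"CiphertextBlob": bytes(b ^ 0x5A for b in Plaintext)}

    def decrypt(self, CiphertextBlob):
        return {"Plaintext": bytes(b ^ 0x5A for b in CiphertextBlob)}
```

Keeping the KMS client injectable also makes the store easy to test without AWS access.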
- Ingesting data – The ingestion process is built to handle different types of APIs, such as paginated, dependent, and rate-limited API calls. For rate limiting, Redis distributed locks are used.
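The Redis-lock-based rate limiting can be sketched as follows, assuming the common SET NX EX pattern (the names `DistributedRateGate` and `FakeRedis` are hypothetical). The stand-in store mimics redis-py's `set(name, value, nx=True, ex=ttl)` so the sketch runs without a server; in production the store would be a `redis.Redis` client:

```python
import time


class DistributedRateGate:
    """Sketch: allow at most one call per `min_interval_s` across workers.

    Acquiring SET <key> NX EX <ttl> succeeds for exactly one worker; the
    key's expiry releases the slot automatically after the interval.
    """

    def __init__(self, store, key: str, min_interval_s: int):
        self.store = store
        self.key = key
        self.min_interval_s = min_interval_s

    def acquire(self) -> bool:
        # redis-py returns True on success, None when NX fails.
        return bool(self.store.set(self.key, "1", nx=True, ex=self.min_interval_s))


class FakeRedis:
    """In-memory stand-in for redis.Redis, for illustration only."""

    def __init__(self):
        self._expiry = {}

    def set(self, key, value, nx=False, ex=None):
        now = time.monotonic()
        current = self._expiry.get(key)
        if nx and current is not None and current > now:
            return None
        self._expiry[key] = now + (ex if ex is not None else float("inf"))
        return True
```

A worker that fails to acquire the gate would sleep or requeue the call rather than hit the source API.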
- Handling data – Raw responses can arrive in any format, such as CSV, JSON, XML, or ZIP. The first step is to normalize the response to JSON; the data is then cleaned and transformed. For volumes too large to handle in memory, streaming is used to process the data in small chunks.
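The streaming step can be sketched with a generator that yields fixed-size batches, so the full response never needs to sit in memory. This is a minimal illustration using a CSV source; the cleaning step here (whitespace stripping) is a placeholder for the framework's real transformations:

```python
import csv
import io
from typing import IO, Iterator, List


def stream_csv_rows(source: IO[str], chunk_rows: int = 1000) -> Iterator[List[dict]]:
    """Read a CSV stream and yield cleaned rows in batches of `chunk_rows`.

    Only one batch is held in memory at a time, so arbitrarily large
    responses can be processed with a fixed memory footprint.
    """
    reader = csv.DictReader(source)
    batch = []
    for row in reader:
        # Placeholder cleaning: strip stray whitespace from every field.
        batch.append({k: (v.strip() if v else v) for k, v in row.items()})
        if len(batch) >= chunk_rows:
            yield batch
            batch = []
    if batch:
        yield batch
```

The same pattern applies to JSON or XML sources by swapping the reader for an incremental parser.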
- Loading data – Once clean data is available, it is converted to CSV and loaded into AWS S3 for easy readability and tracking. The same data is loaded into Amazon Redshift, where downstream applications can query it for analytics.
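The load step can be sketched as two small helpers: one serializing cleaned rows to CSV (the object that would be uploaded to S3, e.g. via boto3's `put_object`, omitted here to keep the sketch dependency-free), and one building a Redshift COPY statement that loads that object from S3. All table, bucket, and role names below are hypothetical:

```python
import csv
import io
from typing import List


def rows_to_csv(rows: List[dict], columns: List[str]) -> str:
    """Serialize cleaned rows to CSV text, ready for upload to S3."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def redshift_copy_sql(table: str, bucket: str, key: str, iam_role: str) -> str:
    """Build the COPY statement Redshift runs to pull the CSV from S3.

    IGNOREHEADER 1 skips the header row written by rows_to_csv.
    """
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS CSV IGNOREHEADER 1"
    )
```

Loading via COPY from S3, rather than row-by-row inserts, is the standard high-throughput path into Redshift and keeps S3 as the auditable record of what was loaded.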
- Deployment – Jenkins is used for CI/CD. AWS Lambda and AWS ECS serve as runtime environments and are scheduled to run the ETL process multiple times a day.

System Diagram

Business Impact
- Data availability – Data is fetched from multiple sources on a daily schedule, so the freshest possible data is available for analysis.
- Handling high-volume data – Huge datasets are ingested and processed with ease, without straining the system.
- Secrets management – Passwords and other secured credentials are easily managed.
- Resource management – Resources are consumed judiciously with the help of Redis distributed locks.
Technology