Breaking into the mainstream music world is incredibly difficult, especially given the proliferation in recent years of so-called “bedroom producers” — artists making music on their laptops at home. Labels are constantly on the lookout for new artists to sign, but also for material they can use with the big artists they’re already working with. This pipeline has historically been fractured and inefficient, with no good way for smaller artists to get noticed, or for Artist & Repertoire (A&R) teams to find them. Enter Beatclub.
Timbaland and his team came to us with a vision: Build a platform to connect not only the world’s beatmakers & bedroom producers with major labels, but also to give artists of all kinds powerful tools to collaborate and license their music.
There are times when you need to start vibing before you start thinking. We got straight into finding the right look and feel, starting with the heart of the platform for exploration – Home – where we knew all the information and actions were going to converge.
Once we had a visual direction it was time to chart out the core building blocks. The system had to be modular in order to ensure future growth potential and scalability. Unlike building a house, a digital product requires simultaneous design of functionality and its supporting building blocks. By revisiting this process throughout the course of the project, we ended up with a robust, scalable design system expressed in a consistent visual language.
Through the course of many conversations, the Beatclub music player emerged as a central defining feature of the experience. It needed to evolve beyond a simple listening tool, packing features to help producers screen, save, annotate, and build upon existing tracks. A few key features are:
An audio waveform provides a visual way to assess the landscape of the track even before it starts playing. This helps listeners judge where key moments like intros or drops begin and end.
Tracks can be marked by their creators with different sections like chorus, hook, and bridge. Listeners can skip to these markers, or set their player to always start from a certain point, letting them quickly scan through material.
Producers think about pace in terms of BPM (beats per minute). Changing BPM is a common sampling technique since it can change the energy or feel of a track. Being able to preview BPM changes can help producers evaluate if a certain track or sample will work for what they have in mind.
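The tempo math behind a BPM preview is simple to sketch. Assuming a straightforward rate change (a real player would pair this with time-stretching so pitch is preserved), a hypothetical helper might look like:

```python
def bpm_change(source_bpm, target_bpm, duration_s):
    """Return the playback-rate ratio and resulting duration for a BPM change.

    A plain rate change (varispeed) also shifts pitch; a production player
    would run a time-stretching algorithm alongside it to keep pitch intact.
    """
    if source_bpm <= 0 or target_bpm <= 0:
        raise ValueError("BPM values must be positive")
    rate = target_bpm / source_bpm    # >1 speeds up, <1 slows down
    new_duration = duration_s / rate  # slower tempo = longer track
    return rate, new_duration

# Slowing a 200-second track from 120 to 90 BPM:
rate, duration = bpm_change(120, 90, 200.0)  # rate 0.75, ~266.7 s
```

Hearing the result at the new tempo before committing is what lets producers judge whether a sample fits their idea.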
The clip tool allows users to save a specific portion of a track, or quickly set up a loop to jam along with if something catches their ear.
A built-in multi-track recorder allows artists to go from idea to demo in record time. This surprisingly powerful tool is essentially a mini audio workstation in your pocket, complete with effects and arrangement capabilities.
Notes allow users to jot down their thoughts on a track as they come. Slide the note along the track timeline to save it at a specific timestamp.
When it comes to navigation, consistency is key. We divided most layouts into three common sections: primary navigation on the left, user content on the right, and search, filtering, and page content in a large central area.
A common problem that we see in some platforms is the search silo. When you enter a search term, you’re sent to a bespoke search results page that’s disconnected from the rest of the platform. If you want to refine your results, you have to start over with a new search. Fortunately, there’s a solution.
We like to take an approach where search terms are treated as simply another type of filter. The active filters area suggests common filters based on what the user has already selected. The search box will suggest known filter types when possible, or will convert a search term into a generic filter chip if not. This lets users mix and match search and filter to refine their results as they go, or go in another direction completely. It also allows us to avoid crafting one-off search results pages for each type of content, saving development effort and everyone’s sanity along the way.
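As a rough sketch of the idea (the filter vocabulary below is invented for illustration, not Beatclub's actual taxonomy), converting search terms into chips might look like:

```python
# Hypothetical filter vocabulary -- a real platform would derive this
# from its content model.
KNOWN_FILTERS = {
    "genre": {"trap", "house", "lofi", "drill"},
    "mood": {"dark", "uplifting", "chill"},
}

def to_filter_chips(query, active=None):
    """Turn free-text search terms into filter chips.

    Terms matching a known filter type become typed chips; anything else
    becomes a generic search chip, so search and filters compose freely.
    """
    chips = list(active or [])
    for term in query.lower().split():
        for ftype, values in KNOWN_FILTERS.items():
            if term in values:
                chips.append({"type": ftype, "value": term})
                break
        else:
            chips.append({"type": "search", "value": term})
    return chips

to_filter_chips("trap 808")
# [{'type': 'genre', 'value': 'trap'}, {'type': 'search', 'value': '808'}]
```

Because every chip lives in the same list, refining, removing, or redirecting a search is just editing that list — no bespoke results page required.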
A modular, component-based approach helps encapsulate responsive behavior within the core building blocks. That means when it comes to designing for mobile, half the battle is already won. The other half is adapting the navigation to suit the mobile form factor.
If a platform needs a lot of content to be successful, getting it on there better be easy. Good forms should accomplish two things: They should be easy for first-time users, and fast for repeat users. The result? A streamlined process that respects users' time and energy.
We started by designing an upload queue where users can dump their files all at once. An intuitive progress tracker gives a bird's-eye view of what’s left to do.
Batch controls mean experienced users can upload large sound libraries and multiple tracks, then fly through processing them. Expandable option groups keep a minimal footprint until needed.
The upload process is broken into clear sections that users can jump between with a click if needed. Progress is always saved automatically, with a summary of each completed section appearing when it’s collapsed.
The connection between label A&R teams and the vast untapped resource of bedroom producers has always been tenuous at best. Even when A&R teams can get music submissions, they don’t have good tools to manage and sort through them. To solve this problem, we created a first-class experience to connect the two.
Every producer on Beatclub has the chance to get a VIP invitation to the opportunities portal where they can submit work for consideration by labels representing major artists. It’s a perfect scouting tool for the labels, and an amazing opportunity for producers to get writing or production credits for top-level artists, and real compensation for their work.
For A&R teams, getting submissions is half of the equation. Sorting through them is a different story. We built team functionality for labels that lets them invite members and set permissions. Team members with access can review submissions in a collaborative and open way that keeps everyone up to date, team and artist alike. When a piece of music is chosen, licensing can be finalized directly on the platform. Mutual access between new talent and labels has never been this easy.
Every free-to-use platform faces the inevitable question: What does our premium tier look like? We wanted to take advantage of the nature of the platform to create a premium experience that would be engaging and rewarding to buyers: Beatclub Pro.
The first component of the Pro subscription is the Subscriber drop. Every month the Beatclub team collects three exclusive beats and one soundpack from top producers and posts them to the subscriber drop page. Subscribers can choose to redeem one item from the list as part of their membership.
This is where things get fun. Subscribers get to upload their work as part of the monthly challenge. When judging closes for the month, the winning tracks are featured on the home page. Here’s the best part: the submissions are available for sale to be licensed and used elsewhere. By taking advantage of the marketplace framework already in place, Beatclub Pro creates a virtuous cycle where exclusive access and recognition aren’t the only things up for grabs.
Task- and app-switching consumes untold hours daily. Since user-to-user messaging was already a requirement for the platform, we extended the feature to include group messages. It’s ideal for A&R teams who spend their day on the platform anyway, and who want to take full advantage of the built-in sharing tools. Groups aren’t just for work though; they also allow platforms to foster community within their user base.
Let’s face it — analytics are cool, but they can also be pretty dry. In an effort to create a positive feedback loop for new and veteran users alike, we designed the live feed panel. This lets users see engagement with their content in real time, letting them feel the action happening on their profile, and encouraging them to grow their music hustle further.
For the launch event, our team was tasked with creating an animation that would be displayed on a loop in the gallery, alongside the in-house modules and prototypes.
With a working title of 'Dream Modules', our goal was to present Framework's potential in helping people and organizations truly build their own custom modules with limitless possibilities.
To achieve this goal, we decided to avoid creating realistic renders that would limit the viewer's imagination and suggest a fixed, predetermined outcome. Instead, we decided to give the animation an illustrative and magical quality. This approach allowed us to convey the technical aspects of the laptop's modular design in a clear and accessible way, while also leaving ample room for viewers to visualize their own unique module ideas.
The input panel on the Framework Laptop 16 is designed with two rows – top and bottom – each capable of accommodating three sizes of modules. Each row has its own insertion and release mechanisms. Since all modules within a row share the same mechanism, we needed to keep the animation consistent across 15 unique modules while leaving the system adaptable through revisions. To achieve this, we developed a base animation for the insertion and release mechanisms in Blender. We stored these animations as actions and used the Non-Linear Animation (NLA) editor to structure and apply them to multiple objects centrally.
To maintain a sense of continuity throughout the animation loop, we designed a display visualisation that spanned the video. Using Blender nodes allowed us to exercise complete control over the animation's speed, length, and visual aspects. For this purpose, we utilized a wave texture with black and white color that was run through the display object. By linking the wave's phase with the animation time and running a color spectrum through the white of the wave, we were able to create the fluid motion seen here.
Following the path we took for the display animation, we took a similar approach for the RGB colors at the back of the keys. Instead of running the color spectrum through the white of the wave, we directly applied it to the shape of the wave to achieve the fast-changing RGB colors. This technique ensured that the keyboard animation remained consistent with the display animation.
We created the 2D graphics for the LED matrix outside of Blender and ran our wave visualisation through the white dots of the matrix, while the graphics texture ran a marquee loop.
A still from the Framework event in March 2023.
Instagram, TikTok, and YouTube are fantastic places to build an audience, but creators who do quickly discover that they have relatively limited options on those platforms when it comes to monetizing the following they’ve built.
Routinr is a place where creators can share and monetize the daily routines that helped them become who they are. These routines can be anything from workouts to meal plans, skincare to business productivity, and everything in between. This model puts creators in control, and gives audiences a more meaningful and interactive way to engage with creators they love.
We were brought in to address a glaring problem. The existing routine creation flow was practically unusable, leading to a horrendous drop-off rate and creators not returning to the platform. This was obviously a problem, since you can’t have a product that sells routines to users if there aren’t any routines available.
Reusability and modularity are non-negotiable features of our process. We design and build with an atomic design approach. Just like the world around us can be progressively broken down into repeated molecules and atoms, all interfaces can and should be composed of well-defined reusable components, ensuring future growth potential, scalability, and consistency.
Tokens (AKA variables) are an indispensable tool in the dev world, and we’re just now starting to see them creep into design. We applied this methodology to our color palette to create a truly flexible system that takes a simple set of hue values as input, and generates a full color palette that we can tune, refine, and adapt as needed. This has been a huge help in the development handoff process, especially when it comes to managing changes. It also lets us pull off another cool trick. More on that later.
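To give a flavor of the approach (the lightness steps and saturation below are illustrative stand-ins, not the actual token values), a palette generator driven by a single hue token might be sketched as:

```python
import colorsys

def palette_from_hue(hue, steps=(0.95, 0.8, 0.6, 0.4, 0.2)):
    """Generate a tint-to-shade ramp from one hue token (0-360 degrees).

    Each lightness step becomes a named shade; changing the hue input
    regenerates the whole ramp, which is what makes rebrands cheap.
    """
    shades = []
    for lightness in steps:
        r, g, b = colorsys.hls_to_rgb(hue / 360.0, lightness, 0.65)
        shades.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return shades

palette_from_hue(210)  # five hex shades of the same blue, light to dark
```

Swapping the input hue regenerates every downstream token at once, which is exactly what makes palette changes cheap to manage during handoff.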
We’re always on the lookout for hierarchy and content differentiation. Getting these right helps new and veteran users alike accurately digest what’s on screen. Our system automatically grabs colors from activity thumbnail images, and uses the resulting palette to individually theme the cards. This helps set activity cards apart from other content, and it also looks pretty cool if we may say so ourselves…
Sometimes we sound like a broken record going on about how important a flexible foundation is. Well, look no further than this project to see why. Remember how at the beginning our job was just to rescue the existing routine creation flow? After we succeeded there, the founders came to us with their phase 2 vision ahead of schedule: a major upgrade to the activity system that would unlock a whole range of awesome features that weren't previously possible.
Technically speaking, this only changed one part of the system, but it rippled through the rest of the product in a big way. Having a flexible system in place allowed us to quickly prototype multiple ideas, and we actually ended up with a solution that neither we nor the founders expected going in. Only by doing the work and iterating through the process efficiently were we able to get there.
The phase 2 upgrade went way beyond the creator side of the platform. It also meant rolling up our sleeves and digging into the as yet untouched user side of things. This restructuring left almost no page untouched, and it gave us the opportunity to apply the new design system to more areas. The product that emerged in the end is sure to be more useful and worthwhile to creators and users alike. It’s also a more confident, modern, and consistent representation of the brand as a whole.
As we mentioned before, you can’t have a marketplace if it’s got nothing on offer. Let’s take a look at some of the quality-of-life improvements that creators got out of this deal.
The original routine creation flow was essentially a list of activities. That might have been okay except for the fact that it was actively hostile to use, and people were reliably abandoning it before finishing their routines.
We completely overhauled this system and gave people what they’re used to: a calendar! The new system does all the things you’d expect. Dragging activities onto the grid view from the library, picking up and moving things, batch deleting, setting activities to repeat, changing the start and end points of a routine, and more. All a breeze with the system.
The calendar is only one part of the routine-building process. Creators also have to add titles, tags, descriptions, media, and more. Whether you go straight through or bounce all over the place, everything is handled smoothly with confirmation at every step that the work is saved.
The previous system didn’t give creators any way to save activities or see all of their content in one place easily. Needless to say, we weren’t cool with that! Now clicking on the aptly named “My content” lets users view, sort, filter, manage, and just generally admire their glorious work. Of course it also works just as well on mobile as it does on desktop.
Creators aren’t the only ones having a good time at this party. The user side of things also got some major attention.
As you might be able to imagine, activity cards are pretty much everywhere on Routinr. The new card design looks familiar everywhere it appears, but it’s flexible enough to adapt to all situations, showing relevant information for each context across web and mobile alike.
Remember that color token system from before? It played a key role because the brand palette changed a few times over the course of the project, and each time we were able to quickly adapt it without tearing our hair out. It also made it easy to implement the only feature people really care about when it comes to color: light and dark mode! Flipping between the two is almost as easy in the design file as it is in the developed product—just like it should be.
Users got a brand-new browse and search experience, starting with a refreshed and snazzy home page. Getting to the content you want is now much easier with the revised search flow that lets you pick content types on the fly.
The new routine purchase flow has a little bit of fun and a whole lot of usefulness sprinkled on top. Starting with a new purchase confirmation animation, and ending with a handy way to sprinkle those new activities into your calendar.
Creators aren’t the only ones who need a good way to sort through their stuff. Users also get a first-class browsing experience for all their purchased content, no matter if they have 1 routine or 100.
10i Commerce (now branded as ShopX) provides a B2B retail platform that enables retailers in non-urban centers to carry virtual inventory. 10i Commerce handles logistics and payments while helping retailers with better customer retention by taking orders for SKUs that are not on the shelf.
We helped 10i Commerce customize Spree to be compliant with Indian tax laws.
During our engagement, we wrote test cases for the existing code and created a CI/CD pipeline with Jenkins. We addressed performance issues and optimized their code, while also scaling Spree to handle over 100k orders.
Our consultants were instrumental in coaching the existing tech team at 10i Commerce on good development practices like crafting clean SOLID code, writing and implementing unit tests, and using CI/CD to automate deployments.
We also conducted technical interviews to help grow their team.
Abridge is an app that helps patients better understand and manage their health. Built for both iOS and Android, Abridge securely records conversations between users and their medical providers, provides audio recordings and transcripts, and delivers intelligent insights on key medical terms and takeaways.
The team at Abridge approached us to augment their internal tech team, accelerate their product roadmap, and leverage our deep experience with native development.
Complex medical jargon is hard to follow, and gets in the way of doctor-patient conversations. Patients are split between listening to the doctor and taking vital notes in parallel.
In a high-pressure clinic environment, it’s easy to miss the minor details - information that can be crucial to a patient’s well-being and overall treatment plan.
Abridge bridges this gap by recording interactions both at the doctor’s office and over the phone.
From the audio, Abridge accurately highlights key medical terms, including symptoms, conditions, medications, and procedures, so that users can revisit their conversations with the doctor, understand their data better, and stay on the same page.
The Abridge project was a unique technical challenge primarily because recording and encoding data in both iOS and Android was a complex process with poorly documented relevant APIs.
Our team also anticipated potential issues with handling multiple edge cases while streaming the audio to the server.
The Abridge app was built on React Native, which lacked straightforward solutions for audio recording and encoding. This created a need to bridge the framework to SDKs to support the feature.
The app also had to run advanced audio processing while streaming a recording to the server, which was an added challenge since it involved seamlessly executing multiple functions like compression and integration with web sockets.
Our experts exhaustively researched the latest design patterns and best practices on React Native to ensure that the app was up-to-date with the latest technologies.
The initial research was focused on diving into old developer documentation and sample code, especially while writing the new native module for audio functions.
We also studied the libflac library (written in C) in detail to implement it in Java.
As consultants, we handled the complete app launch for iOS including setting up the Apple Developer Account, internal testing using TestFlight, creating the App Store listing and submitting the final app version for review.
We were also involved in crafting parts of the app user interface (UI) using React Native, including an ongoing v2 redesign of the complete product.
To improve the user experience, we engineered the native recording libraries to effectively handle interruptions like phone calls while recording, both on iOS and Android.
In the initial version of the app, the audio was uploaded to the server at the end of the recording. Abridge decided to add value to their users by providing key insights and transcripts in real-time, by enabling dynamic streaming and processing.
Our team built native modules from scratch for both iOS and Android to leverage the advanced audio processing capabilities on each platform (AVAudioEngine on iOS, and FlacEncoder/MediaCodecs on Android).
As the primary engineering stakeholders, we continue to help Abridge with bug fixes and maintenance.
Caratlane is one of India’s biggest omni-channel jewellery retailers, spread over 36 cities in India with more than 92 stores in its network.
Ranked by Dataquest, JuxtConsult and SapientNitro as one of the best e-commerce stores in India during the digital boom in 2011, Caratlane has pioneered mindful and usable design for almost a decade - and it was in this same quest that our paths crossed.
Caratlane wanted to streamline and optimize their checkout flow, but that wasn’t the only challenge. The team wanted to build a composable checkout system as a proof of concept, so that they could deploy similar approaches for functions across the store.
Our challenge was to imagine and implement the checkout module as a Single Page Application (SPA) for Caratlane, with a composable structure.
Caratlane was seeing a decline in their checkout conversion rate, with unexplained drop-offs during the transaction cycle. This was something we wanted to dive deep into, in an effort to explore and find potential solutions.
Operating at the cutting-edge of usability was also extremely important for us, so we set out to modernize as many elements and interactions as possible for improved UX.
Our primary challenge was to build progressive changes and systems on their existing codebase, especially since the back-end wasn’t designed to allow RESTful authentication endpoints that our SPA needed.
During our research, we noticed the absence of a robust Unit Test Framework and Continuous Integration (CI) Systems, both of which we implemented during the course of our engagement.
For the CI pipeline, we set up a Jenkins environment with Blue Ocean integrated. For unit testing, we set up the Jest framework.
Here’s what we handled for Caratlane.
During our engagement, we built a separate backend module powered by Mithril to extract the auth token, so that it could be exposed as a REST endpoint for other front end components.
This improved the performance and efficiency of the checkout functionality, making it incredibly user-friendly, and paving the way for similar modules to be built for other aspects of the store.
Our checkout module was also built for enhanced usability, with added responsiveness across multiple screens and devices for a uniform experience.
Our Jest unit tests were designed to test the newly created UI components and elements, so that they could be refined over time.
The Caratlane product team was incredibly happy with the speed of our checkout system and the instant rendering of page components, marking yet another milestone in our long journey with them.
Their tech team quickly adopted our Continuous Integration system, and started setting up their projects in Jenkins.
After our engagement with Caratlane, unit testing became a core part of their development and product release cycle.
Gaea, a global pioneer in inventory and warehouse management and supply chain execution, approached us with a unique problem.
How do you create a collaborative labelling system that redefines enterprise labelling, and keeps manufacturers, buyers, suppliers and customers on the same page?
Cloud Label Service was the solution.
The team at Gaea had envisioned an innovative, centralized and flexible supply chain labelling system, and we were tasked with the interesting challenge of building this complete system.
Enterprise labelling is the lifeline of every supply chain.
In a highly intricate logistics network with multiple stakeholders and outsourcing units, the lack of standardization in labelling can be expensive and damaging, and incorrect labelling is one of the primary culprits.
This particularly impacts low-margin retail supply chains. It also affects customer loyalty, triggering complications with reverse logistics.
After a deep-dive, we quickly realized the gaps in the current system.
Enterprise labelling was disconnected, with too many touchpoints for label distribution across the supply chain and no central quality control systems.
Companies followed different labelling systems, with their own disparate infrastructure and with limited supplier communication or collaboration.
Labelling tools built before CLS prioritized authoring and design rather than complex, rule-driven label generation.
There were no strong digital solutions that could integrate SCM, cloud and on-premise applications to improve efficiency.
After robust research and feedback from multiple internal teams, we focused on a strong rules engine that could support complex computations and dynamic requirements based on the brand, division, and product line.
We also outlined some additional features.
Gaea needed a collaborative platform used by multiple enterprise divisions, third-party suppliers and external stakeholders for greater operational efficiency.
Barcode and labelling standards vary across companies, and the system had to accommodate complex computations based on the brand and product line.
Since users could directly create and print labels, stringent data security measures had to be deployed in line with the expectations of multinational companies.
CLS had to seamlessly integrate with leading SCM systems and cloud systems for improved label lifecycle management.
CLS had to support diverse processes, driven by varying customer behaviour. Some customers enter their data manually, while others need auto-population.
Usable templates with core fields like product details, receipt, reverse logistics, pick ticket, and license plate numbers (LPNs) were needed to save time.
Together with Gaea, we built Cloud Label Service to bring enterprise labelling into the collaborative age.
Designed for cross-enterprise collaboration and functionality, CLS democratizes label template creation, management, and automation.
CLS integrates with leading SCM systems, external data sources and third party tools like Zebra Designer while supporting both cloud-driven and on-prem label printing, for a pan-industrial solution that is secure, robust and uniquely rule-driven.
During the project lifecycle, we developed some key functionalities for Cloud Label Service. Here's a look at the primary ones.
We built a strong rules engine that could support complex computations and dynamic requirements based on the brand, division, and product line.
During bulk printing from multiple servers, requests are likely to clash. Our print queue sequentially streamlines request bundles based on the printer ID.
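Conceptually, the queue keys requests by printer ID so jobs for the same device serialize while different printers drain independently. A minimal in-memory sketch (the production service persisted and distributed this; class and method names here are illustrative):

```python
from collections import defaultdict, deque

class PrintQueue:
    """Serialize print requests per printer ID.

    Jobs for the same printer are handed out strictly in arrival order,
    so concurrent bulk jobs from multiple servers cannot interleave on
    one device; different printers proceed independently.
    """
    def __init__(self):
        self._queues = defaultdict(deque)

    def submit(self, printer_id, job):
        self._queues[printer_id].append(job)

    def next_job(self, printer_id):
        queue = self._queues[printer_id]
        return queue.popleft() if queue else None
```

Each printer effectively gets its own FIFO lane, which is what prevents clashing bundles during bulk printing.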
After extensive ZPL research, we built a parser that could understand the different fields, convert XML data to ZPL and generate real-time previews.
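The core of the conversion can be sketched in a few lines. The `<label>`/`<field>` schema below is invented for illustration (the real parser is template-driven), but the ZPL commands are standard: `^XA`/`^XZ` open and close a label format, `^FO` positions a field, `^FD…^FS` carries its data, `^A0N` selects a font, and `^BCN` draws a Code 128 barcode.

```python
import xml.etree.ElementTree as ET

def xml_to_zpl(xml_data):
    """Convert a minimal <label> XML document into a ZPL print job.

    Text fields render with a scalable font; barcode fields render as
    Code 128. The y cursor advances so fields stack down the label.
    """
    root = ET.fromstring(xml_data)
    lines = ["^XA"]  # start of label format
    y = 50
    for field in root.findall("field"):
        if field.get("type") == "barcode":
            lines.append(f"^FO50,{y}^BCN,80,Y,N,N^FD{field.text}^FS")
            y += 120
        else:
            lines.append(f"^FO50,{y}^A0N,30,30^FD{field.text}^FS")
            y += 50
    lines.append("^XZ")  # end of label format
    return "\n".join(lines)
```

Since ZPL is plain text, the same output can be streamed to a printer or rendered into the real-time previews mentioned above.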
The digital music distribution industry is expected to grow to $1.68 Bn by 2030, and incumbents and insurgents need agile, scalable, and cost-effective infrastructure to remain competitive.
Our client, a global leader and key player, was struggling with an outdated and inefficient deployment process that hindered their ability to innovate and scale. As their technology partners, we identified the need to modernize their platform engineering and helped them seamlessly manage distribution to 150+ digital partners, including Spotify, Apple Music, Amazon Music, Deezer, TikTok, and Tencent.
Our DevOps consultants successfully migrated their deployments from Capistrano on EC2 to Docker and ECS. We set up CI/CD, automated their infrastructure setup using Terraform, and set up observability dashboards to modernize their infrastructure practices.
This case study outlines the challenges, solutions, and results of this transformative project.
Right off the bat, we found gaps in their infrastructure setup.
Capistrano on EC2 had become inefficient and time-consuming as our client’s infrastructure and application needs had grown.
The lack of comprehensive monitoring and logging tools made it difficult to identify performance bottlenecks and areas for improvement.
The existing autoscaling solution was suboptimal, increasing costs and resource wastage.
The application was experiencing memory leaks, negatively impacting performance and stability.
Our client needed a standardized and streamlined environment for various projects to ensure consistency and reproducibility.
This was our plan of action to optimize our client’s infrastructure costs.
To optimize resource utilization and cost-efficiency, we developed custom auto-scaling scripts using AWS Lambda functions that dynamically scaled the number of workers based on the queue size. In this process, we also collaborated with the backend engineering team to move from Sidekiq to SQS.
This allowed us to set up custom CloudWatch metrics on queue length and visualize them on the dashboard. This approach saved the client approximately 50% of the cost of these workloads, while maintaining optimal performance. A further effort by the backend engineering team to rewrite the service from Rails to Go reduced it to 10% of the original cost.
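The scaling decision itself reduces to a small pure function evaluated against queue depth; in a setup like this one, a Lambda would run the equivalent and update the worker service's desired count (the thresholds below are illustrative, not the production values):

```python
def desired_workers(queue_size, per_worker=100, min_workers=1, max_workers=50):
    """Map queue depth to a worker count.

    One worker per `per_worker` queued messages, rounded up, clamped to
    [min_workers, max_workers] so the fleet never scales to zero or runs away.
    """
    needed = -(-queue_size // per_worker)  # ceiling division
    return max(min_workers, min(max_workers, needed))

desired_workers(0)       # 1  -- floor keeps one worker warm
desired_workers(250)     # 3
desired_workers(10_000)  # 50 -- capped
```

Keeping the policy a pure function makes it trivial to unit-test the scaling curve separately from the AWS plumbing around it.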
We began by containerizing our client’s applications using Docker, which allowed for easier packaging and distribution of their applications. This was essential to modernize the tech stack.
Containerizing your workloads enables you to treat your infrastructure as cattle rather than pets. Docker facilitated the transition from Capistrano on EC2-based deployments to the Amazon Elastic Container Service (ECS), a more scalable and efficient container orchestration service. This migration enabled faster and more reliable deployments, simplified management, and better resource utilization.
To replace the existing Jenkins-based deployment process, we implemented CircleCI, a modern and user-friendly Continuous Integration and Continuous Deployment (CI/CD) tool. CircleCI streamlined the deployment pipeline, providing better visibility into failing tests. We also set up CodePipeline to automate the deployment and visualize the value stream map.
We enhanced infrastructure management using Terraform, an Infrastructure as Code (IaC) tool, to automate the provisioning and management of the client's ECS setup. Terraform's declarative configuration files ensured that the infrastructure was consistent, easily reproducible, and versioned, reducing the risk of human error and streamlining ongoing maintenance.
Our consultants set up Amazon CloudWatch dashboards to provide comprehensive monitoring and logging for the client's applications. These dashboards exposed essential metrics and logs, allowing the client to quickly identify and address performance bottlenecks and improve the overall quality of their services. We also set up Grafana dashboards to surface business-level metrics on album deliveries and errors across various stores.

Addressing memory leaks with jemalloc
In collaboration with our client’s engineers, we identified and addressed memory leaks in their applications by moving to jemalloc, a memory allocator library that reduces fragmentation and improves efficiency. This change resulted in improved application stability and performance.
We maintained golden Docker images for each project to ensure consistent and reproducible environments across the client's projects. These images served as the foundation for all project deployments, containing preconfigured environments and dependencies tailored to each project's specific needs. This standardization streamlined development and deployment processes, reducing errors and inconsistencies while increasing overall efficiency.
Our platform engineering work showcased the transformative power of modernizing infrastructure and deployment processes.
By migrating to Docker and ECS and implementing CircleCI and CodePipeline, we significantly streamlined the deployment process, resulting in faster and more reliable deployments.
The CloudWatch dashboards provided crucial performance insights, helping them proactively address bottlenecks and improve service quality.
The custom auto-scaling scripts for distribution workers led to a 90% cost reduction for these workloads while maintaining optimal performance.
By addressing memory leaks with jemalloc, we improved the stability and performance of their applications, resulting in a better overall user experience.
By maintaining golden Docker images for various projects, we ensured consistency and reproducibility, reducing errors and increasing overall efficiency.
By leveraging cutting-edge technologies like Docker, ECS, CircleCI, Terraform, and CloudWatch, we were able to significantly improve the client's deployment efficiency, application performance, resource utilization, and cost-effectiveness.
It’s important to stay agile and innovative in a rapidly evolving world, and adopting best practices isn’t an option anymore - it’s a necessity.
In the era of rapidly evolving technologies, numerous companies, inspired by the hype surrounding GenAI, aspire to integrate it into their operations. However, many find themselves at a crossroads, unsure of how to take the crucial first step.
This case study sheds light on how Tarka Labs assisted GAEA Global, a leader in supply chain solutions, in identifying the most suitable GenAI solution for their domain-specific needs.
Maintenance logs and audit reports come in different formats and are paper-based. These reports are filled in by hand and archived for future reference. Digitizing this treasure trove of data opens up business opportunities, such as improving operational efficiency. Given the sheer number of different templates, however, building an extraction system in the traditional sense is very time-consuming and costly.
Our objective was to solve this problem using the latest techniques powered by LLMs and ML models, achieving the following
There are many ML-powered document data extraction tools available, some of them powered by LLM-based AI models. We scouted the landscape for the best options out there and experimented with the top contenders over a short span of two weeks to assess each offering's capabilities.
Google Document AI offers document processors, pretrained ML models that can extract data from a specific set of documents like license cards and invoices. For our use case, we would have to fine-tune a model with a few hundred documents before it could extract the contextual data. While this would certainly work once done, what we wanted was a zero-shot model that could handle the different types of documents GAEA's customers have.
Here is a summary of our observations from experimenting with the different document processors.
This processor is a plain OCR implementation that extracts the text from a given image or PDF document without any structure. All the text is pretty much lumped together as blobs with no logical separation, and tabular data makes no sense because the processor does not identify columnar groupings. This was definitely not going to work for our use case.
This processor specializes in extracting key-value pairs from a given document, which is exactly what our use case needs. When we tried documents in different formats, we noticed that it extracted around 80% of the keys and values correctly. It was also capable of understanding handwritten text and inferred it correctly most of the time.
Google Document AI also offers Generative AI-powered extractors that identify the structure of the data better. We can also configure the keys to be extracted from the given documents. Even though defining the keys is a manual step (to improve the success rate of extraction), we observed that this yielded better results, extracting all the needed field values from the documents we passed in.
We found that the form parser extracted most of the keys from a given document, but the custom extractor powered by generative AI extracted the values better, given that the keys were already configured. So we tried a hybrid approach, gluing the two processors together (both expose APIs) and passing our documents through the combined pipeline.
Much to our delight, this approach worked better than using these processors in isolation. But read through the end of this article to find out how this fared in comparison to ChatGPT4.
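To make the hybrid approach concrete, here is a minimal Python sketch of the merge step. It assumes the two processor responses have already been flattened into plain key-value dicts; the real Document AI responses are richer structured objects, so the dict shapes here are illustrative.

```python
def merge_extractions(form_parser: dict,
                      custom_extractor: dict,
                      configured_keys: set) -> dict:
    """Combine two extraction passes over the same document.

    Start from the Form Parser's broad key coverage, then overwrite the
    values for the keys the Generative-AI custom extractor was explicitly
    configured with, since those values proved more reliable.
    """
    merged = dict(form_parser)
    for key in configured_keys:
        if key in custom_extractor:
            merged[key] = custom_extractor[key]
    return merged
```

The gluing itself was just two API calls per document followed by a merge of this shape, with the custom extractor winning ties on its configured keys.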
No Generative AI discussion is complete without a comparison against ChatGPT4 (the gold standard as of Jan 2024), and we were also curious how it would perform given that ChatGPT4 is multimodal (it can understand images). So we created a small script that called its API, passing each page of our document as an image, to see how much it could infer.
Much to our surprise, ChatGPT4 performed much better, extracting almost all the keys and values without any of the predefined key configuration we needed in our hybrid approach.
The biggest downside of this approach, though, is that we cannot tweak the output in any way. We have to be happy with what it gives us, as we cannot nudge it by specifying how to extract parts of a complex table or multi-column structure.
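A small script along these lines might look like the following Python sketch. The model name, prompt, and payload shape are assumptions based on the OpenAI chat-completions image-input format, not the exact script we used; the request builder is shown separately from any network call.

```python
import base64


def build_page_request(image_bytes: bytes,
                       model: str = "gpt-4-vision-preview") -> dict:
    """Build a chat-completions payload that sends one document page as
    a base64-encoded inline image. Model name and prompt are assumptions."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Extract every key-value pair on this page as JSON."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{encoded}"}},
            ],
        }],
    }
```

Looping this builder over each rendered page of a document, and posting each payload to the chat-completions endpoint, reproduces the shape of the experiment described above.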
Based on this activity over a period of two weeks, we have consolidated our learnings, and the challenges we foresee in this space, in this section. Given that the Generative AI space is fast evolving and our understanding of LLMs is growing at an astronomical rate, it should be noted that these findings may have a short lifespan (welcome to the AI-driven world, where the cycle time for innovation is drastically shrinking).
While zero-shot approaches (Form Parser / ChatGPT) get us to production faster, we don't have much control to tweak the results and improve their accuracy. If this data is going to be part of a critical business flow, this approach will not work well.
Alternatively, if we can spend time customizing or fine-tuning what we want to extract (custom extractor / hybrid approach), the accuracy of data extraction will be higher and more predictable, even though we spend initial effort defining the keys to be extracted for a given bucket of documents.
Fankave was founded in Silicon Valley by an experienced team out of Netflix and Microsoft. Their vision was to infuse magic into brand campaigns by transforming social content into visually compelling stories.
With Fankave, we primarily worked on building highly immersive and creative visualizations for image- and video-driven social campaigns. Our team was also involved in setting up CI/CD for their solutions, and collaborated with their Aggregator Platform team on minor enhancements.
With creative visualizations, the biggest challenge is to unify an envisioned idea with sound technical implementations that can adapt and evolve in line with the latest trends.
Our primary focus was to build solutions that could evolve over time, since digital visualizations need to retain the competitive advantage of being fresh, adaptive and cutting-edge, compared to traditional dashboards.
Our team decided to build the visualizations as responsive web pages with HTML, CSS, and JavaScript, for greater flexibility in rendering the components and simplified data updates. We picked React for its versatility and its ability to refresh only parts of the page.
As in all software development, we had our fair share of challenges from the least expected places when the rubber met the road.
Contrary to laptops and desktops with predictable aspect ratios, digital billboards can be of any size. While some billboards are wide and stretch to 10 X 4 feet, others can be tall vertical strips that span 3 floors. This brought about an interesting challenge of building components that were both super responsive content-wise, and also aware of their siblings to position themselves sensibly.
After careful consideration and testing across multiple screen sizes, we nailed this requirement with the help of @media tags and multiple vertical and horizontal screen stops.
Digital billboards are usually powered by devices that receive slower upgrades to the latest browser versions and have limited hardware power compared to the latest laptops. Our smooth animations, which used the latest CSS and JavaScript features, worked well on our test machines but did not render completely on the final devices.
We overcame this issue by finding alternate ways to achieve the same experience with older versions of CSS and JavaScript, with lots of help from caniuse.com and the reference manuals.
In a big event with a busy crowd and lots of activity, expecting a reliable network can be wishful thinking. Our solution had to factor this in and was built to work seamlessly and inject new content as and when data access was possible.
We created a local state manager and had a fail-safe mechanism to reuse the existing data and refresh experiences seamlessly with newer data.
Farmstead is a farm-to-fridge grocery delivery startup with a vision to reduce food wastage through AI solutions, for optimized inventory and last-mile logistics.
The team at Farmstead approached us to restructure their front-end and back-end codebase for improved performance, a more immersive and intuitive user experience, and for optimizing their complete backend operations.
Farmstead was a rapidly growing startup, so our primary challenge was to iterate the product development at a much faster pace.
Our team was prepared to identify bottlenecks and refactor problem areas, while driving feature additions at the same time.
We devised an agile plan to accommodate minor and time-sensitive refactorings, while creating a project structure that allowed all developers to adapt and optimize the app during feature development through opportunistic refactoring (the boy scout rule).
Our primary focus was to establish flexible standards to deliver key features, while ensuring that the changes accommodated different styles and standards in the same codebase.
Since our plan for Farmstead was comprehensive, we adopted a strategic approach that would help us refactor the code in parts, while continuing to build, test and deliver new and improved features.
We began by streamlining the unit and end-to-end tests. This was done by setting up factories and helper methods that could be used for new features.
Our team started with re-organizing the front-end codebase into a robust React application, and set standards for the codebase. We reimagined and streamlined the end-to-end shopping experience for the user.
After initial research, we also integrated Redux into the React app and designed it to be functional, independent, and separable from the monolith it sat within.
During the front-end overhaul, we organized all CSS styles into comprehensive stylesheets and enabled a consistent CX across the application.
Our consultants initiated back-end refactoring by separating JSON-serving endpoints and HTML-serving endpoints.
We optimized and reduced overall response times by monitoring all JSON-serving endpoints and studying the responses they served.
Building a modular system and refactoring the codebase into separate modules across the application helped us reduce duplication.
We developed and deployed some critical features for the application:
In the high pressure environment of a production floor, finding the right tools can be a daunting challenge. Well, that was till we devised a solution with NFC tags.
GE Digital Services approached us with a need to optimize how tools were tracked on their shop floor, and we developed SmartCrib.
SmartCrib is an intelligent solution that allows floor managers to manage and track the real-time position of tools used in CNC machines and the location of all carts on the production line.
Our team added NFC tags to the tools and drills used in the CNC machines and placed NFC readers on the tool receptacles on the crib and the cart. The data from the NFC tags was fed back to the central system via a Raspberry Pi.
The Raspberry Pi setup also reads signal strengths from Bluetooth Low Energy (BLE) Beacons placed alongside each machine bay.
Using this data, we visualized the tool positions and cart contents through a web app built with Golang and React.
With Genetic Direction, we had our first foray into DNA-driven precision medicine.
Genetic Direction is on a mission to help people understand their bodies better, and learn how their DNA profiles uniquely influence aging, the ability to lose weight, and micronutrient absorption. The team at Genetic Direction develops programs that assist individuals in modifying their lifestyle based on their genetic predisposition, and add an important dimension to standard and personalized medical intervention.
The company was created in 2015 by veterans and thought leaders from the wellness and health management industries, with a mission to understand:
With the advent of genetic testing and advanced studies of the human genome, Genetic Direction tailors fitness and health management programs based on genetic predisposition. Their suite of programs covers a wide variety of lifestyle challenges: DNA-driven weight management, healthy aging, and comprehensive micronutrient predisposition analysis.
Users can order test kits from Genetic Direction online and share their samples through mail. The expert team at Genetic Direction carries out a detailed analysis of the user's sample and generates extensive and highly visual reports on the DNA genotype, risk profile, and a roadmap to better health. They also support users with instant access to resources like videos, articles, and recipes.
Genetic Direction approached us with a key challenge. They wanted to build a PDF engine that could automatically generate visually rich and data-driven reports from a user's DNA profile - for a simple and seamless document that could be shared with users.
Genetic Direction faced an unprecedented rise in their customer base, which increased their cost per generated PDF to uneconomical levels - especially since this was a recurring cost. To add to this, their existing PDF solution was far from optimal: it was very slow and unresponsive under heavy load. Requests for changes to the PDFs were slow and inefficient, which was a bottleneck for their agility.
We had to find a workaround for these key challenges.
During our research, we understood that the genetic reports needed graphical and dynamic content, based on the user’s profile. It was also important that these PDFs were simple and accessible.
Before their engagement with our team at Tarka Labs, Genetic Direction outsourced PDF generation to an external technology partner. They were planning to build a PDF engine and host it themselves to save their recurring costs from generating PDFs, but the big catch here was that the solution was not readily available and that their templates were quite complex (40 - 50 pages) with lots of dynamic data.
We proposed to build a proof of concept to assess the feasibility of such an engine that would meet all their needs.
After 2 weeks of development (with 2 developers) and constant collaboration with the client on their requirements, we built a proof of concept engine that met all their possible needs for PDF generation.
After validating our model for a standalone PDF engine, we proposed the estimated cost and duration to kick-start the project with 2 developers (one lead and one junior dev). Our team completed all the requirements in 5 weeks with 99% unit test coverage and visual function tests.
Here’s how we enabled automated testing of generated PDFs.
The PDF engine was built to be highly customizable for all kinds of documentation, across users. It's controlled by a JSON file that defines all the sections in order, and determines the templates to be used in each section. All the template names are variable and are controlled by the user data, when the report is generated.
User data also determines the pages that go into each section, and this information is dynamically stitched using PDF generation. Some pages need graphics for greater clarity and readability, so based on the user data, the appropriate chart is picked and placed in predefined parts of the page.
Once the pages are stitched, the engine adds page numbers in the footer and a customized header.
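The section-resolution step described above can be sketched as follows. The config shape and the `$`-prefixed lookup convention are illustrative assumptions; the real engine also rendered charts into predefined slots and stitched actual PDF pages.

```python
def resolve_templates(config: dict, user_data: dict) -> list:
    """Return the ordered list of template names for one report.

    `config` mirrors the JSON file described above: an ordered list of
    sections, each naming a template. A template name prefixed with "$"
    is resolved from the user data (hypothetical convention), so the
    same config can produce different pages per user.
    """
    pages = []
    for section in config["sections"]:
        template = section["template"]
        if template.startswith("$"):
            template = user_data[template[1:]]
        pages.append(template)
    return pages
```

Feeding the resolved template list to the renderer, page by page, yields the stitched document to which the footer and header are then applied.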
Glydel provides a fleet tracking solution for taxi companies in India.
Driven by a vision to bring intelligent tracking, greater transparency and improved customer service and satisfaction into the highly informal market of cab operators in India, Glydel was envisioned as an IoT-based smart transport platform.
The team at Glydel reached out to us during their growth stage, while they were scaling from managing small fleet operators to big fleet owners who had 500+ vehicles. Glydel was facing performance issues, along with a surprise attrition.
Our consultants helped Glydel optimize their database queries.
We started by driving Google Maps bounding-box and nearby-location queries using the btree_gist index, which implements the distance operator (<->).
We also transformed their older jQuery-based application into a structured React.js app, and used Webpack for asset compilation.
This helped Glydel iron out their performance issues during rapid scaling, and reduced attrition while improving overall efficiency and experience.
Hotstar is an online video streaming platform that offers a variety of content, including drama, movies, and live sports. It is owned by The Walt Disney Company India and operated by Disney Streaming, both part of Disney Entertainment. Hotstar is one of the dominant streaming services in India, with over 300 million active users.
As part of their mission to build a next-generation mobile experience, their internal design team redesigned the mobile app for a better user experience and improved user retention.
As experts in iOS development, we were engaged to augment their internal team and make this redesign a reality with sound software development principles.
The mobile application was designed with motion design principles. This means every component animates in harmony with other components based on user interactions like scroll, swipe, and tap. Building an iOS application this interaction-rich out of reusable components is not a simple task, as it requires proper abstraction and well-planned component composition.
We needed to implement both vertical and horizontal animations that could be triggered by a scroll gesture from the user. The type of animation in each direction was different - vertically, we needed a parallax animation, whereas horizontally, we needed a pan animation with crossfading and movement offsets.
Apart from the above, we also needed to support a slow panning gesture as well as a fast flick gesture, and to auto-complete the animation when the pan gesture crossed a particular threshold.
Not only did the animations need to work when the user performed a scrolling action, they also needed to run on a fixed timed interval that was determined on the backend. We needed to ensure that the user-triggered and automated animations did not conflict with each other in any way.
To ensure that the animations were smooth without any perceptible lags or hiccups, we needed to synchronize the animation with the device's refresh rate, whether 60Hz or 120Hz on the latest models.
Reversible animations
We needed to ensure that the animation could be reversed to the starting position if the user did not complete the pan gesture fully.
If the user reached the end of the list of items in the masthead on either side, we needed to allow them to loop back to the start or end again.
Given the multiple requirements and challenges, the solution was not as simple as using the UIView animateWithDuration APIs to drive the animation. Here's how we implemented it.
We used a simple UITableView to host the contents of the home page, including the masthead component. UICollectionView was a no-go, because improvements that would allow us to use it for list-like UI, such as compositional layouts, were only introduced in iOS 13. This also meant that we had to use UIKit, and not SwiftUI as that was also an iOS 13 introduction.
Within the table view, each cell contained a separate component, and the masthead was one such component. The masthead itself consisted of a horizontally scrolling collection view.
Because the root view was a table view, we were able to listen to UIScrollView delegate methods and obtain the scroll offset in the scrollViewDidScroll method. The current scroll offset was broadcast as an event using RxSwift and listened to by the masthead component. The component then further broadcast it to the currently visible cell of the internal collection view, where we updated the Auto Layout constraints to move the necessary subviews in the opposite direction of the scroll, giving a parallax effect.
We used RxSwift instead of, say, NSNotifications, as the former allowed us to expose a type-safe BehaviorSubject containing a CGPoint for the current vertical offset. With notifications, we would have needed to package the CGPoint in an untyped [String: Any] userInfo dictionary and then cast the value when consuming the notification.
The first thing we did to support the custom animation was to disable scrolling on the internal collection view, because we simply did not want the default scroll experience.
Instead, we used a separate pan gesture, and manually transformed the collection view cells and its subviews when the gesture updated, to make it look like the collection view was scrolling during the animation.
Whenever the pan gesture updated, we did the following:
We stored the "translation" of the pan gesture, i.e. the distance the user had moved with the gesture.
To ensure that the animation happened without any jitters or dropped frames, we used a CADisplayLink to drive it. When a display link is configured and added to the run loop, it gives us a callback every time the screen refreshes. In this callback, we calculated the progress of the animation based on the translation stored above. Since this happens on every refresh, no frames are missed or dropped.
We then broadcast the progress to two individual cells - the current cell being animated out, and the next cell being animated in. Each updates its subviews accordingly - for example, the current cell would animate its title out to the side and fade out its poster image, while the next cell would animate its title in from the side and fade in its poster.
Once the animation was complete, we scrolled the collection view to the new indexPath, albeit without any animation of its own, so that its internal state was up to date with what was on screen.
We used a Timer instance to automatically scroll the masthead on a fixed interval, when the user wasn't actively scrolling it.
When the timer fired, we simply unpaused the display link, allowing it to drive the animation from start to finish. Instead of using a gesture's translation to calculate progress, we used the display link's timestamp in each callback to determine how much time had elapsed, and from the animation's total duration we arrived at the progress. Once the animation completed, we scheduled the timer again so that it could run once more as needed.
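The two progress calculations (gesture-driven and time-driven) boil down to simple clamped ratios. Here is a language-agnostic sketch, written in Python for illustration only; the actual implementation was in Swift, and the screen width and duration values are assumptions.

```python
def gesture_progress(translation_x: float, screen_width: float) -> float:
    """Fraction of the transition completed by a pan gesture.

    Negative values indicate panning in the opposite direction;
    the result is clamped to [-1, 1].
    """
    return max(-1.0, min(1.0, translation_x / screen_width))


def timed_progress(start_timestamp: float, now: float, duration: float) -> float:
    """Fraction of an auto-scroll animation completed at time `now`,
    derived from display-link timestamps; clamped to [0, 1]."""
    return max(0.0, min(1.0, (now - start_timestamp) / duration))
```

In both cases the display-link callback feeds the resulting progress value to the outgoing and incoming cells on every screen refresh.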
In some cases, we did not want the timer to fire. For example, if the user was performing some other action, such as scrolling the table view, we did not want to distract them with an auto-scrolling masthead. To achieve this, we added the timer to the main run loop using the default mode, as opposed to the common modes. In the default mode, the timer does not fire while the run loop is busy with other events - in this case effectively pausing it during scroll tracking. As an extra precaution, we also added a check to prevent it from activating while the horizontal pan gesture was active.
The redesigned UI of our client's mobile app, with a homepage layout consisting of both parallax vertical animations and a custom horizontal scroll transition, is now available on iOS. In this day and age, it is critical for mobile apps to stand out with a polished UX that delights users, and we helped our client achieve just that with our implementation of the masthead component.
IDEO is a global design and consulting firm with offices across the world. They are proponents of using design thinking approach to design products, services, environments and digital experiences.
IDEO wanted to build an immersive landing page for their digital product, Creative Difference, that was responsive, fast, lightweight, and worked smoothly across different browsers.
Our challenge was to build this highly animated landing page while meeting the performance goal of rendering at 60 FPS or above.
Their highly customized landing page design required animations and the rendering of new elements on the screen based on the vertical scroll position of the page. They also wanted it built without any proprietary software, sticking to web standards and specifications.
ScrollMagic is an open-source JavaScript library that provides constructs to trigger different actions at different scroll points. It is lightweight and optimized for performance. We implemented the complex interactions the landing page demanded using ScrollMagic.
CSS3 transitions are an expressive and highly performant way of animating elements in the DOM that works well across different browsers. The key is to stick to transition functions that are well adopted across browsers. Our implementation took care of these concerns and successfully delivered all the interactions and experiences the design required.
IDExcel provides solutions for the Asset-Based Lending (ABL) industry, and helps them automate the profiling and risk analysis for credit managers against assets like accounts receivable invoices.
Their codebase had been prematurely split into microservices without a clear separation of concerns. This resulted in domain objects being spread across three microservices that were distinct in terms of UI functionality but shared a lot of models.
Our consultants were brought onboard to assist their team in improving the performance. After our initial analysis, we found that it would be better to transform the app into a monolith.
The team had used Her (an object-relational mapper) to get an ActiveModel-like API for talking to the different microservices. We refactored the codebase to bring the models together as a Rails engine in the main application, and rewrote the Her model calls to use the underlying database tables directly through ActiveRecord.
This eliminated a whole class of N+1 query problems and drastically improved the performance of the application across the board. We also assisted the team in migrating the database, and transformed the nearly unusable parts of the application into some of its fastest.
IFAD is a UN organization dedicated to eradicating rural poverty by providing low-interest loans and direct assistance. We are very proud to be working with IFAD on solving these problems.
IFAD's solution team contributes extensively to open-source software. We initially conducted an independent audit of their codebase and later worked with them on several internal tools, such as MyCalls (a phone call accounting system) and their activerecord-sybase-adapter, which makes Sybase work well with ActiveRecord 4.2.
Inkl approached us with a great vision.
They were out to build a free news service, and were on a journey to eliminate distractions and clickbait from essential and everyday stories.
The team at Inkl was on a mission to curate the best news stories from respected publishers and 100+ international sources, delivered in a clean and highly immersive reading interface that would delight users.
And we loved their idea!
As part of our engagement, we started with helping Inkl simplify their storyboard.
By dividing their single storyboard into multiple cohesive storyboards, we made them easily manageable and modular, so that Inkl could enhance parts of their storyboard in the future without affecting the rest.
For an immersive reader experience, we helped Inkl beautify the UI and build day and night themes into the app, to enhance the distraction-free interface of the app. In our mission to make the app intuitive and understand user interactions and behaviour, we integrated analytics into the app.
As a special feature, our team worked on displaying the Morning Edition notification, which alerts users at a specific time in the morning with their curated news stories.
Jifflenow is a global leader in cloud-based enterprise meeting scheduling solutions for B2B events.
The platform helps companies accelerate their sales cycle through simplified meeting scheduling, tracking, and analysis.
Trusted by over 60 Fortune 1000 companies, Jifflenow empowers businesses by helping them convert leads to qualified prospects, automate the approval workflow, and get intelligent and timely reminders.
Jifflenow transforms the way companies define success at B2B events by enabling them to capture, distribute, and analyze meeting data. This helps businesses assess and compare influenced revenue, and understand their ROI better.
Our team at Tarka Labs has built and delivered their core platform, while also assisting and coaching their internal team in writing good code and following recommended practices in the craft.
Our client was a messaging solution provider. They had a partially completed platform built in Ruby with microservices, a Node.js-based engine that connected to the SMS gateways, and a campaign management system written in PHP.
The campaign management system did not quite integrate with the rest of the platform. They also had a load distributor system called the balancer.
The old campaigner system was rewritten using Elixir and Phoenix. We used React.js with Immutable.js and immstruct for the frontend. We achieved close to 90% code coverage with unit tests, and configured CircleCI not just to run the unit tests but also to build a Docker image and push it to the registry.
DocsDelivered is a bit like MailChimp and other email marketing tools. Customers can upload their contact lists as CSV files or populate them via an API, then set up documents (campaigns). We used Liquid and Markdown to generate documents, which were then delivered via SMS. The recipient can open the document, download it as a PDF if needed, or have it emailed to them by providing an email address. We integrated with Google Analytics to provide impressive analytics and visualization. DocsDelivered was written with Elixir, Phoenix, and React.js. As with the campaigner, we used CircleCI to run our tests and to build and push the Docker image to the registry.
There was an existing implementation of the SMS engine written in Node.js, along with a partial port to Go. We finished porting the implementation from Node.js to Go to support handling incoming SMS from the SMPP gateway. We also reimplemented the distributor as a RabbitMQ-based system, simplifying the effort needed to build a balancer. This decoupled the engine from its consumers, allowing us to iterate quickly.
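The value of the broker-based distributor is the decoupling: the engine publishes a message without knowing who consumes it, and any number of workers compete for work. The sketch below illustrates that pattern with an in-process stand-in (Python's stdlib `queue`), not actual RabbitMQ client code; names like `publish` and `drain` are illustrative.

```python
import queue

# Stand-in for a RabbitMQ queue: the engine publishes without knowing
# who will consume, and any number of workers compete for messages.
broker: "queue.Queue[str]" = queue.Queue()

def publish(sms_id: str) -> None:
    """Called by the SMS engine for each message to deliver."""
    broker.put(sms_id)

def drain(worker_count: int) -> dict:
    """Competing consumers pull until the queue is empty."""
    handled = {w: [] for w in range(worker_count)}
    w = 0
    while True:
        try:
            handled[w].append(broker.get_nowait())
        except queue.Empty:
            return handled
        w = (w + 1) % worker_count

for i in range(4):
    publish(f"sms-{i}")

print(drain(2))  # {0: ['sms-0', 'sms-2'], 1: ['sms-1', 'sms-3']}
```

Because consumers only see the queue, a new consumer type can be added without touching the engine, which is what made fast iteration possible.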
Modus powers the world’s most advanced and innovative vehicle tracking system for insurers, fleet managers, and businesses that rely on superior driver and vehicle performance.
Tarka Labs was part of the team that helped them identify and fix the system's performance issues. Our team built a trip simulator to show their fleet dashboard in real time.
We helped Modus rearchitect their inventory management and their white-labeled CRM platform.
As Modus grew, it needed its infrastructure to scale. We helped Modus process and report on millions of trips taken by their customers, sharding the trips database in PostgreSQL to improve the performance of the reporting infrastructure. As the business spread to multiple geographies and demands on the reliability of the inventory service increased, we migrated it from a purely HTTP-based microservice written in Rails to a RabbitMQ-based system, improving its reliability and allowing inventory changes to be handled without affecting upstream systems.
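The core of any sharding scheme is a deterministic routing function that maps a row to a shard. As a minimal sketch (the shard count, key choice, and table naming here are assumptions, not the production scheme), hashing the customer id keeps each customer's trips on one shard so per-customer reports never need a cross-shard join:

```python
import hashlib

SHARD_COUNT = 4  # illustrative; the real shard count is an assumption

def shard_for_trip(customer_id: str) -> str:
    """Route a trip row to a shard by hashing the customer id.

    Hashing the same key always yields the same shard, so all of a
    customer's trips land together and reads stay single-shard.
    """
    digest = hashlib.sha256(customer_id.encode()).digest()
    shard = int.from_bytes(digest[:4], "big") % SHARD_COUNT
    return f"trips_shard_{shard}"

print(shard_for_trip("customer-42"))
```

The trade-off of hash sharding is that changing `SHARD_COUNT` remaps keys, which is why shard counts are usually chosen generously up front.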
Our team added unit tests to the codebase, moved it from Rails 3 to Rails 4, set up test runs on CircleCI, and automated deployment with Capistrano. We also set up a build monitor to serve as an information radiator, and helped port parts of the API from Rails and Grape to Elixir and Phoenix.
We then built the UI for their flagship Zephyr fleet management platform.
Our experts used React, ImmutableJS, and Immstruct to build a high-performance UI that integrates Google Maps, SVG and Canvas for charting and reports, and WebSockets for live driver tracking. Alongside this, we built utility apps that simulated driving based on past trips, making it easy to test and demo the user interface.
What do we mean by "Product Discovery"?
When a product visionary wants to work with us to bring their ideas to life, we conduct a "Discovery" exercise to understand, analyze, and validate their vision, and to translate it into artifacts like screen mockups, epics, and stories with project estimates and architecture suggestions. This discovery exercise usually takes about a week, depending on the complexity of the idea.
Rahul Juware, founder of Social Labs, approached us with a vision to redefine the way recyclable plastic is processed in India. He envisioned a software platform where recyclers, scrap dealers, and manufacturers could collaborate to better track, trace, and recycle collected waste.
As with any other engagement at Tarka Labs, we suggested that our discovery team spend three days getting into the details of his idea, with a defined set of outcomes promised at the end of the exercise.
Following the client's go-ahead, we kick-started the discovery phase with Sudhakar (Principal Consultant), Harman (Lead Consultant), and Gopi Raja (UX & UI Designer) on the team.
Armed with white charts, markers, and sticky notes, we kicked off an initial discussion with the founder to extensively document his product idea.
We ran focused two-hour sessions each day to discuss, brainstorm, and decide on different functional aspects of the platform.
Being part of a discovery team is at once inspiring and exhausting. Our deep discussions yielded deeper clarifications, and over the course of our sessions we managed to pin the essential workflows and details to the drawing board. So much so that we almost lost track of time and ended up staying at least a couple of hours longer than planned each day.
At the end of the three days, our team emerged with greater clarity and a stronger understanding of the scope and shortcomings of the envisioned idea.
During the discussions, our consultants constantly took notes and sketched out flows on the whiteboard to create the following artifacts.
Our first step was to understand the goal of the system, drill down into the different workflows, and identify the actors in the system along with their intentions and interactions. We charted these out and refined the goals, workflows, and personas over the course of the three days.
Early in our discovery process, we realised there was a clear need for a web app (for administrators and suppliers) and a mobile app (for recyclers and scrap dealers). Based on the defined user interactions and their sub-goals, we created mock screens and designed flows to help users achieve their goals better.
We created digital mockups at the end of the discovery process, arriving at the following ideas.
Once the user flows were nailed down, we moved on to the next logical step: identifying epics and breaking them down into smaller epics and their constituent stories. As part of this process, we added our best guesstimates of the effort involved, using relative T-shirt sizing of the cards (S, M, L, and XL). Cards bigger than XL were broken down further, with details, to simplify their scope.
From here, it was quite easy for the team and the founder to understand the effort that it would take to bring the product idea to life. Based on the possible parallel development streams, we also came up with a range to estimate the cost of building the product.
Overall, the discovery engagement was executed almost as planned. Rahul was happy with the exercise, as it gave him and his investors absolute clarity on the cost, effort, feasibility, and complexity of implementing the product idea.
Tunecore is a New York-based independent music distribution and publishing service with over 250,000 artists on its roster, generating over $2 billion in revenue for its artists and more than 200 billion streams and downloads.
Following Believe Music's (AKA Believe Digital) acquisition of Tunecore, we were involved in making progressive changes and planning Tunecore’s cutover into Believe’s expansive digital music distribution system.
Tunecore and Believe together have a formidable digital supply chain that covers Apple Music, Spotify, Tidal, Amazon Music, YouTube Music, Pandora, TikTok, and over 150 other digital partners.
Our brief was a challenging one.
Distribution systems have rapidly evolved with the music industry, and the digital revolution has replaced traditional systems (from the CD era) with digital stakeholders and platforms to bridge the gap between creators and their audiences.
Digital music distribution is incredibly competitive, and optimizing the timeline between mastering and distribution is crucial.
Consider this: over 14.6 million tracks are uploaded to Spotify every year (that's 40,000 songs a day), and most of these uploads are processed through independent music distribution platforms like Tunecore.
Tunecore helps hundreds of thousands of independent artists send their digital assets to diverse digital partners, and publishes their catalog across marketplaces, stores and streaming services online.
But this is where it gets complex.
Digital ingestion systems aren't standardized across retailers, making it essential for the XML/DDEX delivery specifications and the transcoding processes to be precise. And when two of the biggest players, Believe and Tunecore, bring their catalogs and systems together, accuracy is everything.
Our primary challenge was with a backfill process that we had to initiate.
Tunecore had to transfer a backlog of 8 million songs to Believe’s backend, and their catalog albums and tracks had unique identifiers that had to be streamlined and standardized to the requirements of the new system.
In our roadmap for accelerating the transcoding process, we anticipated issues with metadata variances between digital retailers.
The ingestion systems across retailers, marketplaces and services have unique delivery specifications, and this was a challenge we had to take into account while planning our transcoding pipeline.
Our research focused extensively on studying the current systems used by Tunecore and Believe, in order to explore alternatives that could improve cost efficiency and allow the system to scale seamlessly.
We also had to study existing and new metadata standards like DDEX (delivered as XML) to streamline the product and release data, and to make the files easy to access, catalog, and transfer.
Since accelerated transcoding was a key component, we explored managed AWS options such as Lambda-based pipelines and Elastic Transcoder.
Our recommendations included AWS Lambda for on-demand transcoding and AWS SQS for queue processing to handle high volume transfers.
To future-proof the system, we also recommended Amazon CloudWatch for monitoring the pipelines and scaling, so the system could handle loads of up to 500,000 MAU.
With an eye on performance improvement and complete modernization, our team set to work.
To modernize and drive faster iterations on the overall infrastructure and database design, we extracted and refactored the distribution system from the current application to its own microservice.
Digital music distribution is a rapidly growing space, with increasing YoY growth and escalating demand. We adopted AWS services into the current system to enable on-demand scaling through its supported services/features.
As part of our progressive changes, we also extracted transcoding into its own service. After evaluating AWS Elastic Transcoder, we settled on AWS Lambda, using FFmpeg to master WAV files to FLAC and MediaInfo to extract metadata from the audio files.
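A Lambda-based transcoder of this kind boils down to a handler that derives the output key and shells out to FFmpeg. The sketch below is illustrative only: the event shape, paths, and function names are assumptions, and it assumes an FFmpeg binary is available to the function (for example via a Lambda layer), which the production setup may handle differently.

```python
import os
import subprocess

def flac_key(wav_key: str) -> str:
    """Derive the output object key for a mastered FLAC file."""
    base, _ext = os.path.splitext(wav_key)
    return base + ".flac"

def handler(event, context=None):
    """Sketch of a Lambda transcoding handler (event shape assumed).

    Expects a local WAV path in the event and writes a FLAC next to it.
    """
    src = event["wav_path"]            # e.g. "/tmp/master.wav"
    dst = flac_key(src)
    # FLAC is lossless; -compression_level 8 trades CPU for a smaller file.
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-compression_level", "8", dst],
        check=True,
    )
    return {"flac_path": dst}

print(flac_key("catalog/ISRC123/master.wav"))  # catalog/ISRC123/master.flac
```

Running this per-file in Lambda is what makes the pipeline scale on demand: each upload becomes an independent, parallel invocation rather than a job in a shared transcoding server's queue.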
Over the project lifecycle, we’ve also driven better observability and improved test coverage from as low as 30% to 80%.
True to our vision, our extractions and implementation drastically reduced transcoding time and paved the way for optimized performance.
We were able to significantly minimize the timeline for transcoding and distribution, helping Believe, Tunecore and their thousands of artists deliver music to their audiences at scale.
In the dynamic landscape of warehouse management, the ability to swiftly access and analyze crucial data can make all the difference.
Tarka Labs, at the forefront of GenAI solutions, embarked on a transformative journey with client name to redefine how organizations interact with their data warehouses. This case study delves into the intricacies of leveraging Large Language Models (LLMs) to create sophisticated chatbots capable of revolutionizing warehouse management.
Warehouse employees generally encounter several challenges that hinder their operational efficiency. These challenges, which prompted the exploration of advanced AI solutions, can be outlined as follows:
This exploration aimed to empower employees within organizations that handle large volumes of data. The goal was to enable these employees to extract precise information from their data warehouse management systems. Natural language queries about incoming shipments, historical sales, inventory status, open orders, and supply forecasting became the focal point.
Tarka Labs aimed to showcase how these AI-driven chatbots could revolutionize the way organizations access and analyze their warehouse data.
Tarka Labs adopted a systematic approach, experimenting with various LLMs to gauge their accuracy, understanding of warehouse concepts, suitability for handling sensitive data, and ability to generate SQL queries.
Below is the list of sample questions used for the POC, and how each LLM fared with them.
In the exploration of advanced AI solutions for warehouse management, ChatGPT emerged as a pivotal tool, showcasing remarkable accuracy. Through a systematic two-step process, ChatGPT excelled in responding to 9 out of 10 sample questions.
Its initial step involved identifying pertinent information from the data warehouse management system, leveraging inherent knowledge of warehouse management terminologies. Subsequently, ChatGPT seamlessly crafted queries, demonstrating a nuanced understanding of the warehouse domain.
The implementation further benefited from prompt tuning, where pre-defined definitions were provided to enhance response accuracy, highlighting ChatGPT's adaptability and proficiency in the warehouse management landscape.
The success of ChatGPT in warehouse management queries was attributed to a strategic combination of tools, incorporating agent-based function calling and leveraging the LLM's SQL generation capabilities.
This systematic approach aimed to address challenges methodically, ensuring precision in responses.
To further enhance response accuracy, the implementation incorporated prompt tuning: pre-defined definitions were systematically provided to the LLM, giving it a more nuanced understanding of the warehouse management domain. This adaptability positioned ChatGPT as a valuable asset, capable of addressing complex warehouse queries with precision.
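In practice, this kind of prompt tuning amounts to prepending curated domain definitions and the warehouse schema before the user's question, so the model grounds its SQL in real column names. The definitions, schema, and question below are illustrative stand-ins, not the client's actual data model:

```python
# Curated domain definitions supplied to the LLM (illustrative examples).
DOMAIN_DEFINITIONS = """\
- "open order": a purchase order with status 'OPEN' and no receipt date
- "inventory status": on-hand quantity minus allocated quantity
"""

# Schema description (tables and columns are hypothetical).
SCHEMA = """\
orders(order_id, sku, status, receipt_date)
inventory(sku, on_hand_qty, allocated_qty)
"""

def build_prompt(question: str) -> str:
    """Assemble a grounded prompt: instructions, definitions, schema, question."""
    return (
        "You translate warehouse questions into SQL.\n"
        f"Definitions:\n{DOMAIN_DEFINITIONS}\n"
        f"Schema:\n{SCHEMA}\n"
        f"Question: {question}\nSQL:"
    )

print(build_prompt("How many open orders do we have?"))
```

Because the definitions travel with every request, ambiguous business terms like "open order" resolve the same way on every call, which is where much of the accuracy gain comes from.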
SQLCoder-34B, an open-source large language model, was explored due to privacy concerns surrounding GPT-4, as our client handles sensitive data. This 34B-parameter model specializes in generating SQL queries from natural language and is fine-tuned from a base CodeLlama model, incorporating over 20,000 human-curated questions across ten different domains.
However, SQLCoder-34B faced challenges in its functional performance. It struggled to identify relevant tables for addressing questions, managing to answer only 2 out of 10 questions. Additionally, the model couldn't efficiently break down complex queries to provide solutions.
From a technical standpoint, Amazon SageMaker was employed to deploy this model. Given its fine-tuning for SQL responses, SQLCoder-34B generates SQL output. Despite producing syntactically correct queries, a limitation surfaced: lacking fine-tuning for warehouse-specific queries, the model produced logically incorrect ones.
Llama2 70B, with its solid grasp of warehouse concepts, was investigated for its proficiency in SQL generation. Employing prompt engineering and few-shot learning, we guided the model through prompts to generate SQL for specific queries. We utilized a pre-trained model without fine-tuning for this.
While Llama2 70B demonstrated adept instruction following, it struggled to identify the relevant information for answering questions. The queries it produced were syntactically incorrect, and it used inaccurate column names within the data warehouse. The model generated hallucinated SQL queries, referencing random columns not present in the prompt.
We tweaked the top_p and temperature parameters and observed how the responses changed. Despite its shortcomings in SQL generation, Llama2 70B exhibited commendable reasoning capabilities, providing clear explanations of the logic used in SQL generation.
Recognizing the strength of GPT-4's reasoning, a hybrid approach was proposed: blend the reasoning capabilities of GPT-4 with an open-source LLM.
This combination aimed to generate accurate responses while preserving the privacy of sensitive data. By exposing only the schema to GPT-4 and executing the generated queries in a local SQL engine, a harmonious balance between accuracy and data security was achieved.
In conclusion, Tarka Labs' journey with client name underscores the transformative potential of AI-driven chatbots in warehouse management.
Beyond the technicalities, this case study serves as an invitation for organizations to explore the untapped potential of LLMs.
As technology continues to evolve, leveraging these tools to their full capacity can unlock new possibilities and redefine the way we approach data in the warehouse management landscape.
A tool that helps you learn a new language through podcasts. It shows the podcast transcript in the original language and the user's native language side by side, so the listener can easily understand the meaning of what they hear. This product uses OpenAI Whisper (audio-to-text) and flan-t5-xl (language translation).
A chatbot that can understand large volumes of documents and answer users based on the context and the conversation history. This tool is powered by the RAG technique.
A chatbot that helps your customers pick the right product to buy based on their preferences. It uses an LLM to elicit the customer's preferences and searches for a match in its vector database of items.
A chatbot that can answer questions specific to user data by integrating with an organization's internal API endpoints. A good use case is a customer support chatbot that can give accurate details about a user's account, transactions, history, and so on. This tool is powered by LangChain and API adapters that make it easy to integrate hundreds of APIs.
A tool for software service providers to generate lead-gen emails based on the needs of the target organisation and the capabilities of their own. This tool is powered by a vector database and sales domain knowledge.
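Several of the tools above share one core step: embed items and a query as vectors, then rank items by cosine similarity. The toy sketch below shows that step with hand-made three-dimensional "embeddings" and hypothetical product names; a real system would use an embedding model and a vector database instead.

```python
import math

# Hand-made toy "embeddings"; real ones come from an embedding model.
ITEMS = {
    "trail running shoes": [0.9, 0.1, 0.3],
    "dress shoes":         [0.1, 0.9, 0.2],
    "hiking boots":        [0.8, 0.2, 0.5],
}

def cosine(a, b):
    """Cosine similarity: dot product over the product of magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def best_match(query_vec):
    """Return the item whose embedding is most similar to the query."""
    return max(ITEMS, key=lambda name: cosine(query_vec, ITEMS[name]))

print(best_match([0.9, 0.1, 0.3]))  # trail running shoes
```

Cosine similarity compares direction rather than magnitude, which is why it is the usual choice for ranking embedding vectors.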
Tarka Labs consultants are smart, effective communicators and extremely educative. Their quality of code is highly impressive. They’ve built a robust foundation, where reusable components and systems have been our key differentiators. It’s hard to cross boundaries in technology, but they’ve built thoughtful and scalable solutions by working on multiple code repositories with ease.
It’s been a delight to work with Tarka Labs. Their interactions are articulate and consultants get into great development cadence quickly. They have excellent command in backend and web development, and their solutions added value to the schema of business.
The team at Tarka Labs have exceeded my expectations during our collaboration! Their expertise, coaching, and ability to bring our ideas to life have greatly enhanced our app prototype development and prepared us for the user testing phase of our project. Forever grateful!
Tarka Labs has been a great discovery for Fankave. They helped us build an array of interactive social experiences. Quite impressed with the team’s technical acumen. I'd recommend Tarka Labs if you’re trying to build an MVP, scale your product or at any stage of growth.
I’ve worked with Tarka Labs on a platform project with a lot of moving pieces and custom features that took a deep thinking team to bring the idea to life. They are deep thinkers, analyze current solutions, ask insightful questions, and come up with solutions that really refined the ideas into a truly unique platform. They bring open thinking, disruptive ideas, and challenge conventional thinking. Be prepared!
Tarka Labs has thought leaders and subject matter experts. They solve complex problems and build robust well-architected systems, and our conversations are always enriching. They are passionate tech craftsmen abreast with the latest trends. They have definitely exceeded my expectations.
Tarka truly is a world-class team of skilled thought designers.
The experience design surpassed our expectations, resulting in an elegant and intuitive product blueprint.
The offline internal collaboration within the team results in the clearest product presentation to the end client.