Skip to content

ar0it/MetaGPT

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

89 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

metagpt

Microscopy and other image-generating techniques produce metadata often in proprietary formats, presenting challenges in processing, publishing, and long-term storage of scientific data. While Bioformats, a widely used Java library, aims to address these issues by converting proprietary formats to the open OME format, it faces limitations due to its static nature and struggles to keep pace with evolving file formats.

This thesis explores an innovative approach to metadata transcription inspired by recent advancements in large language models (LLMs). LLMs have demonstrated remarkable capabilities in language translation by developing a conceptual understanding that transcends specific languages. Applying this principle to metadata transcription, we propose a method where input metadata is first abstracted into a conceptual representation before being transcribed to the desired output format.

To investigate this approach, we introduce metagpt, a library for building and testing AI systems for automated metadata transcription. We evaluate five different prediction networks utilizing GPT-4 instances, designed to capture and structure metadata not natively supported by the OME data model. While our initial results did not outperform existing methods, this study lays the groundwork for a new paradigm in metadata management.

Our findings highlight both the potential and current limitations of AI-driven approaches in scientific data management. We discuss the implications of these results for the microscopy community and outline future research directions. Despite initial challenges, we argue that a fully realized "OME GPT" could significantly impact the field, potentially unifying metadata management efforts in microscopy much as the FASTA format did for genomics.

This research contributes to the ongoing dialogue about AI applications in scientific data management and opens new avenues for addressing long-standing challenges in metadata standardization and interoperability.

Set Up

To run the code, you will need to have an OpenAI API key set in your environment. Please look at: https://platform.openai.com/docs/quickstart for a tutorial on setting up the key. Please make sure to install the correct package versions as described in the .yaml, since the syntax in the OpenAI library has changed recently.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published