Introducing Hugot!
Hugot is an open source library to run machine learning models in GoLang (Go) at scale, particularly transformer models (think LLMs!). Our Head of R&D Riccardo Pinosio spoke about the why and the what of Hugot with the folks at Cup o’ Go (the conversation is available as a podcast episode). In this blog post we summarise the key points from this conversation, which reflect the work we are doing at Knights Analytics to contribute to a more open AI ecosystem.
What is Hugot?
Hugot is a library designed to make it easy to integrate transformers and LLM capabilities within Go applications. Its primary goal is to bring the power of Huggingface transformers into the Go ecosystem, enabling developers to leverage state-of-the-art machine learning models in a performant and scalable way.
The Genesis of Hugot
Hugot emerged from the practical need of integrating more AI capabilities into our core product: Alchemia. We had been looking to integrate transformer models in our core engine in a way that was scalable and stable, but Go lagged behind both Python and Rust in terms of production-worthy machine learning libraries. Hugot was developed by Riccardo and our CTO Rob Keevil to bridge this gap, allowing for seamless integration of transformer models in Go.
Understanding Huggingface and Transformers
Huggingface is a pivotal repository for open-source machine learning models and is the industry standard for open-source LLMs. It provides a wide array of models for tasks ranging from image classification to natural language processing (NLP). The transformers library from Huggingface is particularly renowned for making it easy to train transformer models and run inference with them.
Challenges and Solutions with Go
Go offers significant advantages when it comes to performance, concurrency, and stability, making it ideal for processing large streaming datasets. However, integrating AI capabilities natively in Go applications proved a challenge with the current ecosystem. Hugot addresses several shortcomings of existing solutions by using bindings to ONNX Runtime, Microsoft's high-performance inference engine for models in the ONNX (Open Neural Network Exchange) format.
The Benefits of Hugot
Hugot brings several benefits to the table:
Concurrency and Performance: Go's strengths in handling concurrent processes make it efficient for large-scale machine learning tasks.
Seamless Integration: Hugot allows developers to easily integrate machine learning models directly into Go applications, eliminating the need for separate Python microservices and allowing for local deployments (e.g. behind corporate firewalls).
Open Source and Community-Driven: Hugot is open-source and welcomes contributions, making it a growing and evolving project.
Getting Started with Hugot
For developers familiar with Go but new to machine learning, Hugot offers an approachable entry point. The typical workflow involves:
Selecting a Model: Choose a pre-trained model from Huggingface and export it into ONNX format (e.g. using huggingface optimum). If the model is already in ONNX format, Hugot can optionally download it for you!
Implementing in Go: Use Hugot to load your ONNX model, create the pipeline type you need (for instance, one that extracts entities from text), and run inference on batches of input.
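The shape of that workflow (load a model, obtain a pipeline, run it on a batch) can be sketched in plain Go as follows. This is an illustration only and deliberately does not call Hugot: the Pipeline interface, the Entity type, loadToyNER and the model path are all hypothetical, and a lookup table stands in for an ONNX model. For the real function names and options, see the repository linked below.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Entity is one span extracted from a text.
type Entity struct {
	Text  string
	Label string
}

// Pipeline is a hypothetical interface for this sketch, not Hugot's API:
// a loaded model that maps a batch of inputs to one result per input.
type Pipeline interface {
	Run(batch []string) ([][]Entity, error)
}

// toyNER stands in for an ONNX-backed entity extraction model. It tags
// words found in a fixed lookup table.
type toyNER struct {
	known map[string]string
}

// loadToyNER plays the role of "load the model". The path is only checked
// for being non-empty; nothing is read from disk.
func loadToyNER(modelPath string) (Pipeline, error) {
	if modelPath == "" {
		return nil, errors.New("model path must not be empty")
	}
	return &toyNER{known: map[string]string{"Alice": "PER", "Paris": "LOC"}}, nil
}

// Run returns one (possibly empty) entity list per input text.
func (m *toyNER) Run(batch []string) ([][]Entity, error) {
	out := make([][]Entity, len(batch))
	for i, text := range batch {
		for _, word := range strings.Fields(text) {
			word = strings.Trim(word, ".,!?")
			if label, ok := m.known[word]; ok {
				out[i] = append(out[i], Entity{Text: word, Label: label})
			}
		}
	}
	return out, nil
}

func main() {
	// Step 1: load the model (hypothetical path).
	pipeline, err := loadToyNER("./models/my-ner-model")
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}

	// Step 2: run inference on a batch of inputs.
	results, err := pipeline.Run([]string{"Alice visited Paris.", "No entities here."})
	if err != nil {
		fmt.Println("inference failed:", err)
		return
	}
	for i, entities := range results {
		fmt.Printf("input %d: %v\n", i, entities)
	}
}
```

Keeping the pipeline behind a small interface like this means application code depends only on "batch in, results out", so a stub can be swapped for a real model in tests and vice versa.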
Check out the repository and instructions here: https://github.com/knights-analytics/hugot
Future Directions
Currently, Hugot supports inference with pre-trained models. Future developments aim to include training capabilities, enabling complete machine learning workflows within Go. Contributions from the community are crucial to expanding Hugot's functionality, particularly in implementing additional pipelines and enhancing documentation.
Conclusion
Hugot represents a significant step forward in integrating machine learning capabilities within Go applications. By leveraging the power of Huggingface transformers and the performance benefits of Go, Hugot offers a robust solution for developers seeking to incorporate advanced machine learning models into their systems.
For those interested in contributing or learning more, Hugot is available on GitHub, with comprehensive documentation and a welcoming community eager to advance the project further. Big thanks to those of you already contributing!