
Hi all, I recently built https://pretrained.convect.ml.

I’ve been interested in the potential for building web apps on top of pretrained models that have been gaining popularity in the machine learning community. One foundational piece for these apps would be access to predictions by these models with fast response times. So, I’ve made models available via an API for a few AI tasks:

1) Text generation (GPT-2, GPT-Neo 125M, and PEGASUS for paraphrasing): Provide text and generate more text with a similar style and content. Use these models to build an AI writing assistant or even synthesize entire articles.

2) Computer vision (CLIP): Measure the association between any sequence of images and a list of arbitrary texts. Use CLIP in an app to detect vehicles, animals, trees, household appliances, or other physical objects that you can describe with words.

3) Conversation (Blenderbot 400M Distill): Build an AI-powered chatbot that responds to user inputs. TBH Blenderbot’s responses don’t always make sense, so I’d be careful with this one.

4) Article summarization (Bart Large CNN): Generate a summary of the salient points in an article. Use this model to build tools to help people consume information faster.

5) Text classification (Bart Large MultiNLI): Measure the association between any sequence of words and a list of arbitrary text labels. A classification model can be used to detect topics, e.g. send customer call center transcripts to this API to detect whether customers are reaching out about specific topics of interest, such as product defects or payment discrepancies.

6) Sentiment analysis (DistilBERT base uncased finetuned SST-2): Detect the sentiment of a piece of text. This model, for example, could be used to measure overall customer approval levels for a product from social media posts.

Many of the endpoint response times are sub-second and could be used in applications to provide a near-real-time experience. I have also included CodePens (in React) for each of the models to make it easy to get started with building on top of them. All the endpoints handle preflight requests from any origin, so applications can be purely browser-based if you want to do that.
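For a rough sense of what a call looks like, here is a stdlib-only Python sketch. The endpoint path and JSON payload shape below are assumptions for illustration, not the documented API; see the CodePens for the real request format.

```python
# Hypothetical sketch of calling one of the endpoints with only the stdlib.
# The URL path ("/generate") and payload shape are assumptions, not the
# actual pretrained.convect.ml API contract.
import json
import urllib.request

payload = json.dumps({"text": "The quick brown fox"}).encode()
req = urllib.request.Request(
    "https://pretrained.convect.ml/generate",  # hypothetical path
    data=payload,
    headers={"Content-Type": "application/json"},
)
# Uncomment to actually send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Since the endpoints handle cross-origin preflight requests, the equivalent browser-side fetch would work without a proxy.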

I’m not the first to make these models available for free over an API but am hoping to make the experience of getting started as easy as possible. I’d also love to chat with anyone who has been curious about using these models in their projects or if there’s a particular model that you wish was readily available as an API endpoint.


Thanks! Super glad to hear that this is addressing a real pain point.

The infra aspect has been on my mind — how to easily onboard users who prefer to use their own infra. Love the idea of selling a CloudFormation template on AWS Marketplace. I’ll look into that. Really appreciate this feedback :)


Thanks for the feedback! That’s totally fair. My plan for now is to work with early users to understand what makes sense to them. At this early stage, any price I come up with would be almost a wild guess.


At the moment, Convect only deploys scikit-learn models, but I am planning to support inference with PyTorch and other ML frameworks. Not planning to run on GPUs, since other products already help with that, unless I see interest from my users. Convect would support concurrency according to AWS Lambda's quota, which is in the hundreds of thousands.


Before the "boo birds" show up, I want to support this move as practical and useful. Real graduate projects using scikit-learn plus domain knowledge (gasp) regularly produce powerful results. (I have no prior connection to this product.)


Yup, this was my thinking too. :)

The variety of data problems you can work on with good old logistic regression (or the other models in scikit-learn) is really quite vast and seemed like a reasonable place to start.


Thanks! AWS Lambda supports Docker image sizes up to 10 GB (according to their docs), so on the back end, 3-5 GB could still work.

Convect, in its current state, is still limited to small models (e.g. < 0.5MB). This is because the deployment happens by posting to a REST API, which hasn't yet been tested for large payloads.

I wrote up some tips for decoupling a large model from the data that's causing it to be large (https://convect.readme.io/docs/news-topic-model). However, it sounds like you're asking about a truly large model. Convect is still in the MVP stage right now, but we plan to handle larger models. 3-5 GB models will likely be feasible in the future.
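A quick way to gauge whether a trained model fits that small-payload budget is to measure its pickled size up front, for example:

```python
# Rough size check before deploying: Convect's current REST deploy path is
# tested only for small payloads, so it helps to know a model's pickled size.
import pickle

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
logreg_size = len(pickle.dumps(LogisticRegression(max_iter=1000).fit(X, y)))
forest_size = len(pickle.dumps(RandomForestClassifier(n_estimators=100).fit(X, y)))
print(logreg_size, forest_size)  # linear models are tiny; ensembles grow fast
```

Linear models on small feature sets land comfortably under the limit, while tree ensembles and anything that embeds training data (e.g. nearest-neighbor models) grow quickly.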


Thank you for checking it out!


Great question! Yes, Convect supports deploying pipelines and functions in which you can include pre-processing code. Here are two tutorials that walk through that: https://convect.readme.io/docs/wine-classifier-pipeline and https://convect.readme.io/docs/iris-plant-classifier-functio...
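As a sketch of what such a pipeline might look like (details assumed, not copied from the linked tutorials):

```python
# Sketch of a deployable pipeline that bundles preprocessing with the model;
# the exact dataset and estimator choices here are illustrative assumptions.
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipeline.fit(X, y)
print(pipeline.predict(X[:1]))  # scaling runs automatically at predict time
```

Because the scaler is part of the fitted pipeline object, callers of the deployed endpoint can send raw feature values and the preprocessing happens server-side.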


Thanks! FastAPI/ColabCode is a fair comparison. And I just tried it in Google Colab, and it does work! You can install Convect in a notebook cell with "!pip3 install convect".


Hi HN, I’m Sameeran. I’m building Convect (https://convect.ml).

Convect deploys machine learning (ML) models to instantly callable, serverless API endpoints. Using Convect, Jupyter notebook users can deploy trained models from their notebooks and share them with the world in seconds. No web development or infrastructure experience is needed. Convect simplifies the process by being more opinionated than other model deployment workflows.

To give Convect a try, visit https://app.convect.ml. You can also try out models without signing into an account on the demo page (https://app.convect.ml/#/demo). I would love your feedback.

Some background/context:

Deploying ML models to be used in production entails a different set of skills than training models in a sandbox environment and can get pretty complicated depending on what you’re trying to do. For many data scientists, this “sandbox” environment is a Jupyter notebook. One common approach to “deploying to production” that I saw while previously working as a data scientist at Intuit is turning a model trained with scikit-learn in a notebook into an API endpoint.

From my experience, there are a few ways to deploy a model to an API endpoint and all of them involve a nontrivial level of effort and time. Examples of some of the steps in the process include pickling a model and uploading it to cloud storage, Dockerizing a model’s prediction code and environment, deploying a Flask app, or getting set up with an ML framework (e.g. MLFlow) or platform (e.g. SageMaker) so you can use the deployment feature in their SDK.
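To make the first of those steps concrete, here is a minimal sketch of the pickle round trip (serialize, "upload", load, predict); the cloud storage and serving pieces (Flask, Lambda, etc.) are elided:

```python
# Minimal sketch of the manual pickle step described above: serialize a
# trained model, then load it back the way serving code would at startup.
import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

blob = pickle.dumps(model)     # the bytes you would upload to cloud storage
restored = pickle.loads(blob)  # what the serving code would do at startup
print(restored.predict(X[:1]))
```

Even this simplest path still leaves you to build the storage, the HTTP layer, and a matching runtime environment yourself.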

While complex workflows make sense for deploying complex models, I haven’t seen any dead simple deployment solutions for simple models, and that’s what I am working on building with Convect. In this case, simplicity comes at the cost of flexibility, i.e. you give up the ability to customize your infrastructure and runtime environment in exchange for a simple, one-click workflow. My hypothesis is that this tradeoff is worth it in many situations, and I’m curious to see what people are enabled to build when this aspect of the ML workflow is drastically simplified.

Under the hood, Convect creates two artifacts by serializing 1) the model prediction code and 2) all the variables that are in scope in the Python session at deployment time. I’ve made this part of the deployment code public here: https://github.com/convect-ml/convect/. These artifacts are then loaded and executed in an AWS Lambda function upon invocation by an API Gateway endpoint.
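Here is an illustrative stdlib-only sketch of that two-artifact idea. One caveat: plain pickle, used here, stores a top-level function by reference (module plus name), whereas the approach in the linked repo serializes the code itself so it can run in a separate Lambda process; treat this purely as a conceptual model.

```python
# Illustrative two-artifact sketch using plain pickle (conceptual only; see
# the caveat above about pickling functions by reference).
import pickle

threshold = 0.5  # a variable sitting in the notebook session

def predict(x, scope):
    # Artifact 1: the prediction code, parameterized by the captured scope.
    return x > scope["threshold"]

# Artifact 2: the in-scope variables the prediction code needs at serve time.
blob = pickle.dumps({"predict": predict, "scope": {"threshold": threshold}})

restored = pickle.loads(blob)
print(restored["predict"](0.7, restored["scope"]))  # True
```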

I’ve talked with 50+ data scientists at small to medium-sized companies (2-300 employees) and many have identified deployment as a pain point in their workflows. I’ve also spoken with a few data scientists who have indicated that this would help save time on their after-work/weekend side projects.

I’m sharing this now on Show HN because I’d love for people to try out Convect and to hear about how you use it or how I can improve it to make it useful. I’ve also put together a gallery of examples for training and deploying models to make it easy to quickly get started (https://convect.readme.io/docs) and provided example endpoints that you can use query or even build ML-powered apps on top of (https://app.convect.ml/#/demo). Find out more at https://convect.ml. Thanks for having a look!


> 2) all the variables that are in scope in the Python session at deployment time.

My eyebrows raised in surprise at this point! Obviously there's a tradeoff between convenience ("Oooh it's working! great let me deploy right now") and reproducibility ("Hmm I need to re-deploy what my teammate was working on because <reason>, let me go and find their notebook... ah here it is, oh weird why do I have this undefined variable") and this is hard on the convenience side.

Was it surprising to you too that you'd have users who wanted this location on the continuum?


You’re absolutely right that this product is heavy on the convenience side of the tradeoff.

> Was it surprising to you too that you'd have users who wanted this location on the continuum?

I’m still working on the “have users” piece, but there were signals for wanting more simplicity/convenience in conversations that I had with data scientists. I also felt as a data scientist that existing deployment solutions sometimes felt over-engineered for flexibility and reproducibility, at the cost of convenience. I'm exploring the other end of the continuum with Convect: how opinionated can we get with ML workflows to simplify them and still be useful?


Looks cool. But what is the ConvectApp exactly? It's blocked on my work network.


Thanks! app.convect.ml is the ML model admin page.

The workflow looks somewhat like this:

Within a notebook, you can: 1) train a model, 2) deploy a model, and 3) get a shareable serverless API endpoint.

Outside the notebook, app.convect.ml provides a UI for editing, sharing, or deleting deployed models as well as getting the API key for deploying models.

