News

MIT researchers enhance AI training with synthetic images

The new technology can help to address the challenges of data acquisition in machine learning and better train models

Massachusetts Institute of Technology (MIT) researchers have been working to reshape artificial intelligence (AI) training using StableRep, a system that uses text-to-image models to generate diverse synthetic images for machine learning (ML). The team, led by MIT PhD student Lijie Fan, set about developing AI training using synthetic images, with StableRep’s approach centred on ‘multi-positive contrastive learning’, allowing the developers to surpass traditional ‘real-image’ training methods.

New, innovative AI training

The research shows how StableRep generates synthetic images through popular text-to-image models, such as Stable Diffusion, in a process similar to creating worlds with words. Multiple images spawned from identical text prompts are treated as positive pairs, which provide additional information during training, adding diversity and specifying to the vision system which images are alike and which are different.

Lijie Fan, of the MIT Computer Science and Artificial Intelligence Laboratory, and lead researcher on the work,
said: “We’re teaching the model to learn more about high-level concepts through context and variance, not just feeding it data.

“When multiple images, all generated from the same text, are all treated as depictions of the same underlying thing, the model dives deeper into the concepts behind the images, say the object, not just their pixels.”

The success of StableRep lies in tuning the ‘guidance scale’ in the generative model, achieving a delicate balance between diversity and fidelity in synthetic images. The technology can help to address the challenges of data acquisition in ML, and also move AI training beyond conventional methods, delving into high-level concepts through context and variance.

Testing has shown that the method is effective, outperforming established models such as SimCLR and CLIP on extensive datasets. The addition of language supervision, seen in StableRep+, has shown high accuracy and efficiency compared with models trained on vast collections of real images.

4 IMAGING AND MACHINE VISION EUROPE DECEMBER 2023 / JANUARY 2024
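
For readers who want to see the ‘multi-positive’ idea in concrete terms, the sketch below shows one common way such a loss can be written. It is an illustration under stated assumptions, not the MIT team’s code: the function name, the `prompt_ids` argument and the temperature value are hypothetical, and NumPy is used only to keep the example self-contained. Each image embedding is compared with every other embedding in the batch, and the target spreads probability evenly over the other images generated from the same prompt.

```python
import numpy as np


def multi_positive_contrastive_loss(embeddings, prompt_ids, temperature=0.1):
    """Illustrative multi-positive contrastive loss (hypothetical helper).

    embeddings: (n, d) array of image features.
    prompt_ids: length-n sequence; images sharing an id came from one prompt.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    ids = np.asarray(prompt_ids)

    # L2-normalise so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    logits = z @ z.T / temperature

    # An image is never compared with itself.
    self_mask = np.eye(len(ids), dtype=bool)
    logits = np.where(self_mask, -np.inf, logits)

    # Row-wise log-softmax, shifted by the row maximum for stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_q = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Positives: other images generated from the same prompt.
    positives = (ids[:, None] == ids[None, :]) & ~self_mask
    if not positives.any(axis=1).all():
        raise ValueError("every prompt needs at least two images")

    # Target distribution: uniform over each anchor's positives.
    p = positives / positives.sum(axis=1, keepdims=True)

    # Cross-entropy between target and predicted distributions.
    per_anchor = -(p * np.where(positives, log_q, 0.0)).sum(axis=1)
    return float(per_anchor.mean())
```

In this sketch the loss falls when images from the same prompt sit close together in embedding space and rises when they are scattered among images from other prompts, which is the behaviour the quote above describes.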
Overcoming challenges with imagery and text

Despite its successes, StableRep faces challenges, including the slow pace of image generation, potential biases in text prompts, and complexities in image attribution. Additionally, the generative model behind StableRep must first be trained on large-scale real data. The researchers acknowledge this initial reliance on real data, but recognise the potential to repurpose a well-trained generative model for new tasks.

In spite of the questions over application, David Fleet, Google DeepMind researcher and Professor of Computer Science at the University of Toronto, said: “One dream of generative model learning has long been to be able to generate data useful for discriminative model training.”

He added: “This paper provides compelling evidence, for the first time to my knowledge, that the dream is becoming a reality. They show that contrastive learning from massive amounts of synthetic image data can produce representations that outperform those learned from real data at scale, with the potential to improve myriad downstream vision tasks.”
MIT News