OpenAI's GPT-3 model was widely considered the state of the art when it was introduced in May 2020. GPT-3 set a new standard for deep learning by generating text that is nearly indistinguishable from human-written prose.
It's amazing how much can change in a year. On Tuesday, Chinese researchers released Wu Dao 2.0, a giant artificial intelligence model that appears capable of nearly everything GPT-3 can do, and much more.
The first thing to note is that Wu Dao is massive: it was trained on 1.75 trillion parameters (essentially, the coefficients the model learns during training), a full ten times more than GPT-3 and 150 billion more than Google's Switch Transformer.
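To make the scale concrete: "parameters" are just the learned weights of a network, and counting them is simple arithmetic. A minimal sketch, with illustrative layer sizes that are not Wu Dao's actual architecture:

```python
# Parameter counting for a toy fully connected network.
# A dense layer mapping n_in -> n_out units holds n_in*n_out weights plus n_out biases.
layer_sizes = [(1024, 4096), (4096, 4096), (4096, 1024)]  # hypothetical layers

total = sum(n_in * n_out + n_out for n_in, n_out in layer_sizes)
print(f"{total:,} parameters")  # 25,175,040 parameters

# The headline comparison: Wu Dao 2.0's 1.75 trillion parameters vs. GPT-3's 175 billion.
print(1.75e12 / 175e9)  # 10.0
```

Even this toy three-layer network holds about 25 million parameters; Wu Dao's count is roughly 70,000 times larger.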
To train a model with this many parameters quickly (Wu Dao 2.0 arrived only three months after Wu Dao 1.0), BAAI researchers developed FastMoE, an open-source training system akin to Google's Mixture of Experts architecture.
Built on PyTorch, FastMoE allows the model to be trained on conventional GPUs as well as on supercomputer clusters.
That gives it more flexibility than Google's TPU-powered system: FastMoE does not require proprietary hardware and can run on off-the-shelf equipment.
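A Mixture-of-Experts model keeps its total parameter count huge while activating only a few "expert" subnetworks per input, which is what makes trillion-parameter training tractable. Here is a toy sketch of the routing idea in plain NumPy; it is not FastMoE's actual API, and all names are illustrative:

```python
import numpy as np

def moe_forward(x, experts, gate_w, k=2):
    """Toy mixture-of-experts layer: route one input to its top-k experts.

    x        : (d,) input vector
    experts  : list of (d, d) weight matrices, one per expert
    gate_w   : (n_experts, d) gating weights
    k        : number of experts activated per input
    """
    logits = gate_w @ x                      # score every expert for this input
    top = np.argsort(logits)[-k:]            # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the selected experts only
    # Only the k chosen experts run; the rest stay idle, so compute per input
    # grows with k, not with the total number of experts.
    return sum(w * (experts[i] @ x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate_w = rng.standard_normal((n_experts, d))
y = moe_forward(rng.standard_normal(d), experts, gate_w, k=2)
print(y.shape)  # (8,)
```

Real systems such as FastMoE additionally parallelize the experts across devices and batch the routing, but the sparse-activation principle is the same.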
A whole host of capabilities comes along with that computing power. Most deep learning models are trained for a single task: composing copy, generating deepfakes, recognizing faces, winning at Go.
Wu Dao, by contrast, is multimodal, similar in theory to Facebook's anti-hate-speech AI or Google's recently released MUM. BAAI researchers demonstrated its capabilities at the lab's annual conference on Tuesday.
The model can write essays, poems, and couplets in traditional Chinese, and can generate nearly photorealistic images from natural language descriptions. Wu Dao also demonstrated that it can power virtual idols (with help from Microsoft spinoff XiaoIce) and predict the 3D structures of proteins, as DeepMind's AlphaFold does.
During the conference on Tuesday, BAAI President Dr. Zhang Hongjiang said, "The way to artificial general intelligence is big models and big computers. What we're building is a power plant for the future of AI. With mega data, mega computing power, and mega models, we can transform data to fuel the AI applications of the future."