Generative AI

Introduction
• It’s easy to forget just how much you know about the world: you understand that it is made up of 3D environments; objects that move, collide, and interact; people who walk, talk, and think; animals who graze, fly, run, or bark; and monitors that display information encoded in language about the weather or who won a basketball game.
• This tremendous amount of information is out there and to a large extent easily accessible, either in the physical world of atoms or the digital world of bits. The only tricky part is to develop models and algorithms that can analyze and understand this treasure trove of data.
• Generative models are one of the most promising approaches towards this goal. To train a generative model, we first collect a large amount of data in some domain (e.g., millions of images, sentences, or sounds) and then train a model to generate data like it.

Generative AI
• Generative AI emulates human intelligence in tasks such as image recognition and natural language processing. It can be trained across many domains, reusing training data to solve new problems. Example uses include chatbots, media creation, and product development.
• Applications of generative AI demonstrate its capacity to raise productivity, saving time and cost, and to transform customer experiences by responding more naturally and accurately. It can also be useful in creative tasks, software development, data extraction, and marketing.
• Generative AI uses machine learning models, such as foundation models (FMs) and large language models (LLMs), which are trained on large volumes of data. FMs perform general tasks based on the generalized, unlabeled data they receive. They use the patterns they have learned to predict the next element in a sequence, such as the next word in a text, or to modify an image.
• LLMs, by contrast, focus on language tasks such as generating summaries, text, and conversations. They have millions to billions of parameters, learned from training data largely collected from the internet, which allow them to learn advanced concepts and apply them in many different situations.

Main models for training
• Discriminative modeling is used to classify existing data points (e.g., images of cats and guinea pigs into respective categories). It mostly belongs to supervised machine learning tasks.
• Generative modeling tries to understand the dataset structure and generate similar examples (e.g., creating a realistic image of a guinea pig or a cat). It mostly belongs to unsupervised and semi-supervised machine learning tasks.
https://www.altexsoft.com/blog/business/supervised-learning-use-cases-low-hanging-fruit-in-data-science-for-businesses/

Objective - Generative AI
• Suppose we use a newly initialized network to generate 200 images. How should we adjust the network's parameters so that it produces slightly more believable samples in the future?

Training - Generative AI
• Generator (G): A generative model that creates new data.
• Discriminator (D): A discriminative model that distinguishes between real data and data generated by the generator.
• The generator and discriminator are initialized with random weights.
• The training process alternates between updating the discriminator's weights and updating the generator's weights.

Discriminator & Generator

Generator Training
• Generate Fake Data: The generator creates a new batch of fake data from random noise.
• Discriminator Prediction on Fake Data: The discriminator evaluates the fake data, predicting whether it is real or fake.
• Label Fake Data as Real: For the purpose of training the generator, the fake data is labeled as real (e.g., label 1). The goal is to trick the discriminator.
• Compute Generator Loss: The generator computes its loss based on how well it managed to fool the discriminator.
This is done by comparing the discriminator’s predictions on the fake data against the label “real.”
• Update Generator: The generator’s weights are updated using gradient descent to minimize its loss, while the discriminator’s weights are held fixed.

Discriminator Training
• Input Real Data: The discriminator receives a batch of real data samples from the training dataset.
• Input Fake Data: The generator creates a batch of fake data samples from random noise (often sampled from a normal distribution).
• Label Real Data: The real data samples are labeled as real (e.g., label 1).
• Label Fake Data: The fake data samples are labeled as fake (e.g., label 0).
• Compute Discriminator Loss: The discriminator computes its loss by comparing its predictions on real and fake data against the true labels (real or fake).
• Update Discriminator: The discriminator’s weights are updated using gradient descent to minimize its loss.

Training
• The process repeats, alternating between training the discriminator and training the generator, until the generated data is indistinguishable from the real data.

Training summary
• Discriminator Phase:
  o Input: Real data and fake data.
  o Objective: Maximize the probability of correctly classifying real and fake data.
  o Loss Function: Cross-entropy loss.
  o Update: Backpropagation and gradient descent to adjust weights.
• Generator Phase:
  o Input: Random noise.
  o Objective: Minimize the discriminator's ability to classify fake data as fake.
  o Loss Function: Cross-entropy loss, where fake data is labeled as real.
  o Update: Backpropagation and gradient descent to adjust weights.
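The alternating discriminator and generator phases described above are the training scheme of a generative adversarial network (GAN). The sketch below is one way to illustrate them on a deliberately tiny problem; it is not taken from the slides. It assumes 1-D data, a "real" distribution of N(4, 0.5), a linear generator G(z) = a*z + b, and a logistic discriminator D(x) = sigmoid(w*x + c). The function names (`discriminator_step`, `generator_step`, `train`) are hypothetical, and the gradients are derived by hand in place of a backpropagation library.

```python
import math
import random


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def bce(p, label, eps=1e-12):
    """Binary cross-entropy of one prediction p against a 0/1 label."""
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def discriminator_step(d, g, real_batch, noise_batch, lr):
    """One gradient-descent update of D: real data labeled 1, fake data labeled 0."""
    w, c = d
    a, b = g
    grad_w = grad_c = loss = 0.0
    n = len(real_batch)
    for x_real, z in zip(real_batch, noise_batch):
        x_fake = a * z + b  # generator output, treated as a constant here
        p_real = sigmoid(w * x_real + c)
        p_fake = sigmoid(w * x_fake + c)
        loss += (bce(p_real, 1) + bce(p_fake, 0)) / n
        # For a sigmoid output, d(BCE)/d(logit) = p - label.
        grad_w += ((p_real - 1.0) * x_real + p_fake * x_fake) / n
        grad_c += ((p_real - 1.0) + p_fake) / n
    return (w - lr * grad_w, c - lr * grad_c), loss


def generator_step(d, g, noise_batch, lr):
    """One gradient-descent update of G: fake data labeled as real (1), D held fixed."""
    w, c = d
    a, b = g
    grad_a = grad_b = loss = 0.0
    n = len(noise_batch)
    for z in noise_batch:
        p_fake = sigmoid(w * (a * z + b) + c)
        loss += bce(p_fake, 1) / n
        # Chain rule through the fixed discriminator back to the generator output.
        dloss_dx = (p_fake - 1.0) * w
        grad_a += dloss_dx * z / n
        grad_b += dloss_dx / n
    return (a - lr * grad_a, b - lr * grad_b), loss


def train(steps=2000, batch=64, lr=0.05, seed=0):
    """Alternate discriminator and generator updates on the toy problem."""
    rng = random.Random(seed)
    # Both models start from small random weights.
    d = (rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1))
    g = (rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1))
    d_loss = g_loss = float("nan")
    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]  # assumed real data
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        d, d_loss = discriminator_step(d, g, real, noise, lr)
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        g, g_loss = generator_step(d, g, noise, lr)
    return d, g, d_loss, g_loss
```

Calling `train()` returns the final discriminator parameters `(w, c)`, the final generator parameters `(a, b)`, and the last loss of each phase. A toy like this is not guaranteed to converge; it is meant only to make the label assignment and the two update steps concrete.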
Training flowchart
[Figure: flowchart of the training process]

References
• https://openai.com/index/generative-models/
• https://www.altexsoft.com/blog/generative-ai/
• https://www.cmu.edu/intelligentbusiness/expertise/genai-principles.pdf
• https://chatgpt.com/
• https://openai.com/index/dall-e-2/