Discover Academic Papers
We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs...
We introduce a new state space model architecture called Mamba that combines structured state spaces with selective mechanisms. Mamba achieves linear time complexity and low memory usage through a simple selective mechanism...
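As a rough illustration of the selective mechanism this abstract describes, the sketch below implements a simplified selective state-space scan in NumPy: B, C, and the step size delta are computed from the input at each token, so the recurrence can keep or discard information per token while the loop stays linear in sequence length. The projection shapes and the discretization are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return np.logaddexp(0.0, z)

def selective_ssm_scan(x, A, W_B, W_C, W_delta):
    """Simplified selective SSM scan (Mamba-style; illustrative shapes).
    x: (T, D) inputs; A: (D, N) state matrix with negative entries;
    W_B, W_C: (D, N) input projections; W_delta: (D, D) step projection."""
    T, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))                  # per-channel hidden state
    y = np.empty((T, D))
    for t in range(T):                    # one step per token: O(T) time
        delta = softplus(x[t] @ W_delta)[:, None]    # (D, 1) input-dependent step
        B_t = (x[t] @ W_B)[None, :]       # (1, N) input-dependent B
        C_t = (x[t] @ W_C)[None, :]       # (1, N) input-dependent C
        A_bar = np.exp(delta * A)         # (D, N) discretized state transition
        h = A_bar * h + delta * B_t * x[t][:, None]  # selective state update
        y[t] = (h * C_t).sum(axis=1)      # (D,) readout
    return y
```

Because the hidden state h has fixed size (D, N), memory stays constant in sequence length, which is where the low memory usage mentioned above comes from.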
Chain of thought prompting has emerged as a powerful technique for improving the reasoning capabilities of large language models. In this paper, we show that the effectiveness of chain of thought can be further enhanced through self-consistency...
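Self-consistency is simple enough to sketch: sample several chain-of-thought completions at nonzero temperature and majority-vote over the final answers. In the sketch below, the `model.generate` method and the "Answer:" extraction convention are assumptions for illustration, not any specific API.

```python
from collections import Counter

def self_consistency_answer(model, prompt, n_samples=10, temperature=0.7):
    """Sample diverse reasoning paths and return the majority-vote answer.
    Assumes `model.generate(prompt, temperature)` returns a completion
    string whose final answer follows an "Answer:" marker (hypothetical)."""
    answers = []
    for _ in range(n_samples):
        completion = model.generate(prompt, temperature=temperature)
        if "Answer:" in completion:
            # Keep only the final answer; the reasoning path may differ.
            answers.append(completion.rsplit("Answer:", 1)[1].strip())
    # Agreement across independent reasoning paths selects the answer.
    return Counter(answers).most_common(1)[0][0] if answers else None
```

The intuition is that distinct reasoning paths converging on the same answer are more likely correct than any single greedy decoding.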
We present Gemini, a family of multimodal models that exhibit remarkable capabilities across image, audio, video, and text understanding. The models achieve state-of-the-art performance across a wide spectrum of benchmarks...
We present DreamCraft3D, a novel approach for high-quality 3D content creation from text descriptions. Our method introduces a hierarchical generation framework that leverages a bootstrapped diffusion prior...
We present PaLM 2, a large language model that builds on the success of PaLM and incorporates various architectural and training improvements. These deliver higher quality and lower latency while addressing responsible AI considerations...
We introduce Code Llama, a family of large language models for code based on Llama 2, providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts...
We present Llama 2, a collection of pretrained and fine-tuned large language models ranging in scale from 7 billion to 70 billion parameters. This is the next generation of our open source Llama family of large language models...
LoRA is an efficient and effective method for adapting large language models to specific tasks or domains while maintaining their general capabilities. It reduces the number of trainable parameters significantly...
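The low-rank idea is compact enough to show directly: freeze the pretrained weight W and train only a rank-r update B @ A added on top. Below is a minimal NumPy sketch of the standard LoRA formulation; the names and scaling convention follow common practice rather than any particular library's API.

```python
import numpy as np

class LoRALinear:
    """Linear layer with a frozen pretrained weight plus a trainable
    low-rank update, so trainable parameters drop from d_out * d_in
    to r * (d_in + d_out)."""
    def __init__(self, W, r=8, alpha=16):
        d_out, d_in = W.shape
        self.W = W                                # frozen pretrained weight
        self.A = 0.01 * np.random.randn(r, d_in)  # trainable down-projection
        self.B = np.zeros((d_out, r))             # trainable up-projection,
                                                  # zero-init so the update
                                                  # starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        # y = W x + (alpha / r) * B A x
        return self.W @ x + self.scale * (self.B @ (self.A @ x))
```

For a 4096 x 4096 projection with r = 8, the adapter trains 8 * (4096 + 4096) = 65,536 parameters against roughly 16.8 million frozen ones, about 0.4%.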