# RustyFlow: A Pure Rust Machine Learning Library

RustyFlow is an experimental machine learning library written entirely in Rust, designed for building and training neural network models, starting with a Transformer-based language model. The project aims to provide a clear, efficient, and robust foundation for deep learning in Rust, leveraging `ndarray` for numerical operations and `wgpu` for cross-platform GPU acceleration.

## Features

- **Transformer Architecture**: Implements key components of the Transformer model, including Multi-Head Attention, Feed-Forward Networks, Layer Normalization, and Positional Embeddings.
- **Automatic Differentiation (Autograd)**: A custom autograd engine enables automatic gradient computation for training neural networks.
- **Optimizers**: Stochastic Gradient Descent (SGD) is currently implemented.
- **Data Handling**: Utilities for vocabulary building, tokenization, and batching.
- **Cross-Platform GPU Acceleration**: Leverages `wgpu` (WebGPU) for GPU-accelerated matrix multiplications, supporting Metal (macOS/iOS), Vulkan (Linux/Android), and DirectX 12 (Windows).
- **Model Serialization**: Save and load trained models for persistence and inference.
- **Interactive Chat**: Engage with trained language models in a conversational interface.
- **Profiling**: Basic profiling tools to analyze performance bottlenecks (CPU vs. GPU).

## Getting Started

### Prerequisites

- **Rust and Cargo**: If you don't have Rust installed, you can get it from [rustup.rs](https://rustup.rs/).

  ```bash
  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
  ```

- **`git`**: For cloning the repository.

### 1. Clone the Repository

```bash
git clone https://github.com/cekim7/RustyFlow.git
cd RustyFlow
```

### 2. Build the Project

Compile the project in release mode. This will create the `cli` executable in `target/release/`.

```bash
./run.sh build
```
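The data-handling utilities listed under Features (vocabulary building, tokenization, batching) follow the usual word-level language-modeling pattern. The sketch below illustrates that pattern in dependency-free Rust; the type and function names here are invented for illustration and are not RustyFlow's actual API.

```rust
use std::collections::HashMap;

/// Word-level vocabulary: maps tokens to ids and back.
struct Vocab {
    token_to_id: HashMap<String, usize>,
    id_to_token: Vec<String>,
}

impl Vocab {
    /// Assign ids in order of first appearance in the corpus.
    fn build(text: &str) -> Self {
        let mut token_to_id = HashMap::new();
        let mut id_to_token = Vec::new();
        for tok in text.split_whitespace() {
            if !token_to_id.contains_key(tok) {
                token_to_id.insert(tok.to_string(), id_to_token.len());
                id_to_token.push(tok.to_string());
            }
        }
        Vocab { token_to_id, id_to_token }
    }

    /// Encode text as token ids, skipping out-of-vocabulary words.
    fn encode(&self, text: &str) -> Vec<usize> {
        text.split_whitespace()
            .filter_map(|t| self.token_to_id.get(t).copied())
            .collect()
    }
}

/// Slice a token stream into (input, target) pairs of length `seq_len`,
/// with targets shifted one position ahead, as usual for language modeling.
fn batches(ids: &[usize], seq_len: usize) -> Vec<(&[usize], &[usize])> {
    (0..ids.len().saturating_sub(seq_len))
        .step_by(seq_len)
        .map(|i| (&ids[i..i + seq_len], &ids[i + 1..i + 1 + seq_len]))
        .collect()
}

fn main() {
    let vocab = Vocab::build("to be or not to be");
    let ids = vocab.encode("to be or not to be");
    // ids == [0, 1, 2, 3, 0, 1]
    for (input, target) in batches(&ids, 2) {
        println!("{input:?} -> {target:?}");
    }
}
```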
### 3. Set Up Configuration and Download Data

A `config.env` file is used to manage hyperparameters and settings. You can create a default one and download the necessary datasets:

```bash
# Create a default config.env if it doesn't exist
./run.sh setup

# Download the TinyShakespeare dataset (required for tinyshakespeare training)
./run.sh get-data tinyshakespeare
```

The `wikitext-2` dataset is included directly in the repository under `data/wikitext-2`. For more details on data handling, see [`docs/data_handling.md`](docs/data_handling.md).

## Usage

RustyFlow provides separate scripts for training and chatting on CPU or GPU, all configurable via `config.env`.

### Configuration (`config.env`)

The `config.env` file (created by `./run.sh setup`) allows you to customize:

- `DATASET`: `tinyshakespeare`, `wikitext-2`, `short`, or a path to a custom text file.
- `SEQ_LEN`: Sequence length for training and context for chat.
- `NUM_EPOCHS`, `BATCH_SIZE`, `LEARNING_RATE`: Training hyperparameters.
- `EMBED_DIM`, `NUM_HEADS`, `NUM_LAYERS`: Model architecture.
- `MODEL_PATH`: Base path for saving/loading models (scripts append `-cpu.bin` or `-gpu.bin`).
- `TEMPERATURE`, `TOP_P`: Sampling parameters for chat.

You can create multiple `.env` files (e.g., `wikitext.env`) and pass one as an argument to the scripts:

```bash
# Example: use a custom config file for training
./train_gpu.sh wikitext.env
```

### Training

Training scripts automatically save the trained model to the path specified in `config.env` (with `-cpu.bin` or `-gpu.bin` appended) and log performance metrics to `training_log.txt`.

#### Train on CPU

```bash
./train_cpu.sh
```

#### Train on GPU (Apple Silicon, NVIDIA, AMD, Intel)

```bash
./train_gpu.sh
```

**Note on GPU acceleration**: While GPU acceleration is implemented, achieving significant speedups requires offloading more operations than just matrix multiplication. The `wgpu` backend provides cross-platform compatibility but introduces overhead from data transfers between CPU and GPU on each operation. For a detailed explanation, refer to [`docs/acceleration.md`](docs/acceleration.md).

### Chatting

After training a model, you can interact with it using the chat scripts.

#### Chat on CPU

```bash
./chat_cpu.sh
```

#### Chat on GPU

```bash
./chat_gpu.sh
```

If you run `chat_cpu.sh` or `chat_gpu.sh` without a trained model, the script lists the available models in the `models/` directory.

### Other Commands

The `run.sh` script also offers utility commands:

- `./run.sh build`: Compiles the project in release mode.
- `./run.sh check`: Checks the project for errors without building.
- `./run.sh test`: Runs all unit and integration tests.
- `./run.sh get-data [dataset]`: Downloads the specified dataset (e.g., `tinyshakespeare`).
- `./run.sh help`: Displays usage information.

## Documentation

- **Data Handling**: [`docs/data_handling.md`](docs/data_handling.md)
- **GPU Acceleration**: [`docs/acceleration.md`](docs/acceleration.md)
- **Model Evaluation**: [`docs/evaluation.md`](docs/evaluation.md)

## Contributing

RustyFlow is a work in progress. Contributions, feedback, and suggestions are welcome! Please open an issue or submit a pull request on GitHub.

## License

This project is licensed under the MIT License - see the LICENSE file for details.
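For reference, a complete `config.env` built from the keys described in the Configuration section might look like the following. Every value here is an illustrative assumption chosen for a small experiment, not a project default; run `./run.sh setup` to generate the real defaults.

```bash
# Hypothetical config.env; all values are illustrative, not defaults.
DATASET=tinyshakespeare
SEQ_LEN=64
NUM_EPOCHS=10
BATCH_SIZE=32
LEARNING_RATE=0.001
EMBED_DIM=128
NUM_HEADS=4
NUM_LAYERS=2
MODEL_PATH=models/shakespeare
TEMPERATURE=0.8
TOP_P=0.9
```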
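The `TEMPERATURE` and `TOP_P` settings control how the chat scripts pick each next token. The std-only sketch below shows the standard technique they name, temperature scaling followed by top-p (nucleus) sampling; the function name and signature are assumptions for illustration, not RustyFlow's actual sampling code. The uniform draw `u` is passed in as a parameter so the example needs no RNG crate.

```rust
/// Pick a token index from a probability distribution using temperature
/// scaling followed by top-p (nucleus) filtering. `u` is a uniform random
/// draw in [0, 1), supplied by the caller.
fn sample_top_p(probs: &[f32], temperature: f32, top_p: f32, u: f32) -> usize {
    // Temperature reshapes the distribution: < 1.0 sharpens, > 1.0 flattens.
    let scaled: Vec<f32> = probs.iter().map(|p| p.powf(1.0 / temperature)).collect();
    let sum: f32 = scaled.iter().sum();
    let mut ranked: Vec<(usize, f32)> =
        scaled.iter().map(|p| p / sum).enumerate().collect();
    // Sort descending and keep the smallest prefix with cumulative mass >= top_p.
    ranked.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    let mut cum = 0.0;
    let mut cut = ranked.len();
    for (i, (_, p)) in ranked.iter().enumerate() {
        cum += p;
        if cum >= top_p {
            cut = i + 1;
            break;
        }
    }
    let nucleus = &ranked[..cut];
    let mass: f32 = nucleus.iter().map(|(_, p)| p).sum();
    // Draw from the renormalized nucleus.
    let mut acc = 0.0;
    for (idx, p) in nucleus {
        acc += p / mass;
        if u < acc {
            return *idx;
        }
    }
    nucleus[nucleus.len() - 1].0
}

fn main() {
    let probs = [0.5, 0.3, 0.15, 0.05];
    // With T = 0.7 and top_p = 0.8, only the two most likely tokens survive.
    println!("{}", sample_top_p(&probs, 0.7, 0.8, 0.1)); // prints 0
}
```

Lower `TEMPERATURE` makes the model more deterministic; lower `TOP_P` discards the unlikely tail of the distribution before drawing.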
