---
title: "Installation Guide"
author: "Chen Yang"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Installation Guide}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  echo = TRUE,
  message = FALSE,
  warning = FALSE,
  eval = FALSE
)
```

# Installation Guide

This guide provides detailed instructions for installing and configuring mLLMCelltype for cell type annotation in single-cell RNA sequencing data.

## System Requirements

Before installing mLLMCelltype, ensure your system meets the following requirements:

- **R version**: 4.0.0 or higher
- **Memory**: At least 8GB RAM recommended (more for large datasets)
- **Operating System**: Windows, macOS, or Linux
- **Internet Connection**: Required for API calls to LLM providers

## Installing the R Package

### Installation from CRAN (Recommended)

mLLMCelltype is now available on CRAN. You can install it directly using:

```{r}
# Install from CRAN
install.packages("mLLMCelltype")
```

This will install the stable version of mLLMCelltype with all required dependencies.

### Installation from GitHub (Development Version)

To install the latest development version from GitHub:

```{r}
# Install devtools if not already installed
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Install mLLMCelltype development version
devtools::install_github("cafferychen777/mLLMCelltype", subdir = "R")
```

### Installation from a Local Source

If you have downloaded the source code or need to install from a local copy:

```{r}
# Assuming the package is in the current working directory
devtools::install_local("path/to/mLLMCelltype/R")
```

## Dependencies

mLLMCelltype depends on several R packages that will be automatically installed during the installation process.
The main dependencies include:

- **dplyr**: For data manipulation
- **httr**: For API requests
- **jsonlite**: For JSON parsing
- **R6**: For object-oriented programming
- **digest**: For caching mechanisms
- **magrittr**: For pipe operations

For visualization and integration with single-cell analysis workflows, the following packages are recommended but not required:

- **Seurat**: For integration with Seurat objects
- **ggplot2**: For visualization
- **SCpubr**: For publication-ready visualizations

## API Keys Setup

mLLMCelltype requires API keys to access LLM providers. You need a key for at least one of the supported providers.

### Obtaining API Keys

1. **OpenAI (GPT-4o/4.1)**
   - Visit OpenAI Platform
   - Create an account or log in
   - Navigate to the API keys section
   - Create a new API key

2. **Anthropic (Claude-3.7/3.5)**
   - Visit Anthropic Console
   - Create an account or log in
   - Generate an API key

3. **Google (Gemini-2.0/2.5)**
   - Visit [Google AI Studio](https://makersuite.google.com/)
   - Create a Google account or log in
   - Generate an API key

4. **Other Providers**
   - Similar processes apply for DeepSeek, Qwen, Zhipu, MiniMax, Stepfun, and Grok
   - Visit their respective websites to obtain API keys

### Setting Up API Keys

There are three ways to set up your API keys:

#### 1. Environment Variables

Create a `.env` file in your project directory with your API keys:

```
# API Keys for different LLM models
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
GEMINI_API_KEY=your-gemini-key
DEEPSEEK_API_KEY=your-deepseek-key
QWEN_API_KEY=your-qwen-key
ZHIPU_API_KEY=your-zhipu-key
STEPFUN_API_KEY=your-stepfun-key
MINIMAX_API_KEY=your-minimax-key
GROK_API_KEY=your-grok-key
OPENROUTER_API_KEY=your-openrouter-key
```

Then load the environment variables in your R script:

```{r}
# Requires the dotenv package: install.packages("dotenv")
library(dotenv)

# Reads the .env file in the current working directory
dotenv::load_dot_env()
```

#### 2. Direct Specification in Function Calls

You can directly provide API keys in function calls:

```{r}
library(mLLMCelltype)

results <- annotate_cell_types(
  input = your_marker_data,
  tissue_name = "human PBMC",
  model = "claude-sonnet-4-5-20250929",
  api_key = "your-anthropic-key",
  top_gene_count = 10
)
```

#### 3. R Environment Variables

Set API keys as environment variables from within an R session:

```{r}
Sys.setenv(OPENAI_API_KEY = "your-openai-key")
Sys.setenv(ANTHROPIC_API_KEY = "your-anthropic-key")
# Set other API keys as needed
```

## Verifying Installation

To verify that mLLMCelltype is installed correctly and API keys are set up properly:

```{r}
library(mLLMCelltype)

# Check if the package is loaded correctly
packageVersion("mLLMCelltype")

# Verify API key setup for a specific provider
api_key <- get_api_key("anthropic")
if (!is.null(api_key) && api_key != "") {
  cat("Anthropic API key is set up correctly\n")
} else {
  cat("Anthropic API key is not set up\n")
}
```

## Common Installation Issues

### Package Installation Failures

If you encounter issues during installation:

1. **Check R version**: Ensure you're using R 4.0.0 or higher
2. **Update devtools**: Run `install.packages("devtools")` to ensure you have the latest version
3. **Check dependencies**: Some dependencies might require system libraries on Linux

### API Connection Issues

If you encounter issues connecting to LLM APIs:

1. **Verify API keys**: Ensure your API keys are correct and have not expired
2. **Check internet connection**: Ensure you have a stable internet connection
3. **Proxy settings**: If you're behind a proxy, configure R to use your proxy settings

```{r}
# Example of setting a proxy for httr.
# The host and port below are placeholders; replace them with your proxy's values.
httr::set_config(httr::use_proxy(url = "proxy.example.com", port = 8080))
```

### Memory Limitations

For large datasets, you might encounter memory issues:

1. **Increase R memory limit**: On Windows with R versions before 4.2.0, use `memory.limit(size = 16000)` to increase available memory. From R 4.2.0 onward, `memory.limit()` is no longer supported and has no effect
2. **Process data in batches**: Consider processing large datasets in smaller batches

## Next Steps

Now that you have installed mLLMCelltype, you can proceed to:

- [Getting Started](https://cafferyang.com/mLLMCelltype/articles/getting-started.html): Learn the basics of using mLLMCelltype
- [Usage Tutorial](https://cafferyang.com/mLLMCelltype/articles/usage-tutorial.html): Explore more advanced usage scenarios
- [Visualization Guide](https://cafferyang.com/mLLMCelltype/articles/visualization-guide.html): Learn how to visualize your results

If you encounter any issues not covered in this guide, please [open an issue](https://github.com/cafferychen777/mLLMCelltype/issues) on our GitHub repository.
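
Before opening an issue about API connections, it can help to confirm which provider keys are actually visible to the current R session. The following is a minimal sketch using only base R. `check_api_keys()` is a hypothetical helper written for illustration and is not part of mLLMCelltype; the variable names are the ones used in the `.env` example in this guide.

```r
# Hypothetical helper (not part of mLLMCelltype): report which of the given
# environment variables are set to a non-empty value in this R session.
check_api_keys <- function(vars = c("OPENAI_API_KEY",
                                    "ANTHROPIC_API_KEY",
                                    "GEMINI_API_KEY")) {
  # Sys.getenv() returns "" for variables that are not set (unset = "")
  values <- Sys.getenv(vars, unset = "")
  is_set <- nzchar(values)
  names(is_set) <- vars
  is_set
}

# Print one line per variable without revealing the key itself
status <- check_api_keys()
for (v in names(status)) {
  cat(sprintf("%-20s %s\n", v, if (status[[v]]) "set" else "missing"))
}
```

A variable reported as missing usually means the `.env` file was not loaded from the expected working directory, or that `Sys.setenv()` was called in a different session.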