Development Environment Setup

Complete guide for setting up a Celline development environment, from basic installation to advanced debugging tools.

Prerequisites

System Requirements

Component	Minimum Version	Recommended
Python	3.9+	3.11+
Node.js	16+	20+
R	4.0+	4.3+
Git	2.20+	Latest
Operating System	Linux/macOS	Ubuntu 22.04 / macOS 13+

Development Tools

Essential tools for Celline development:

      # Package managers
pip install uv  # Fast Python package manager
npm install -g pnpm  # Fast Node.js package manager

# Code quality tools
pip install ruff black isort mypy
pip install pre-commit

# Testing frameworks
pip install pytest pytest-cov pytest-xdist
pip install coverage

# Documentation tools
pip install sphinx sphinx-rtd-theme

Repository Setup

1. Fork and Clone

      # Fork the repository on GitHub first
git clone https://github.com/YOUR_USERNAME/celline.git
cd celline

# Add upstream remote
git remote add upstream https://github.com/YUYA556223/celline.git

# Verify remotes
git remote -v

2. Branch Strategy

      # Create development branch
git checkout -b develop
git push -u origin develop

# Feature branch workflow
git checkout develop
git pull upstream develop
git checkout -b feature/my-new-feature

3. Development Installation

Using UV (Recommended)

      # Install with all development dependencies
uv sync --all-extras --dev

# Activate virtual environment
source .venv/bin/activate  # Linux/macOS
# or
.venv\Scripts\activate.bat  # Windows

Using Pip

      # Create virtual environment
python -m venv celline-dev
source celline-dev/bin/activate

# Install in development mode
pip install -e ".[dev,test,docs]"

# Install additional development tools
pip install -r requirements-dev.txt

Development Configuration

1. Pre-commit Hooks

Set up automated code quality checks:

      # Install pre-commit
pip install pre-commit

# Install hooks
pre-commit install

# Test hooks
pre-commit run --all-files

Create .pre-commit-config.yaml:

      repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files

  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.1.6
    hooks:
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]
      - id: ruff-format

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.7.1
    hooks:
      - id: mypy
        additional_dependencies: [types-all]

2. Editor Configuration

VS Code Settings

Create .vscode/settings.json:

      {
  "python.defaultInterpreterPath": "./.venv/bin/python",
  "python.formatting.provider": "black",
  "python.linting.enabled": true,
  "python.linting.ruffEnabled": true,
  "python.linting.mypyEnabled": true,
  "python.testing.pytestEnabled": true,
  "python.testing.pytestArgs": ["tests/"],
  "files.associations": {
    "*.toml": "toml"
  },
  "editor.formatOnSave": true,
  "editor.codeActionsOnSave": {
    "source.organizeImports": true
  }
}

PyCharm Configuration

Interpreter: Set to .venv/bin/python
Code Style: Import Black configuration
Inspections: Enable MyPy and Ruff
Test Runner: Configure pytest

3. Environment Variables

Create .env file for development:

      # Development settings
CELLINE_DEBUG=true
CELLINE_LOG_LEVEL=DEBUG
CELLINE_TEST_MODE=true

# Database settings
CELLINE_DB_PATH=./test_db
CELLINE_CACHE_DIR=./.cache

# API settings
CELLINE_API_HOST=localhost
CELLINE_API_PORT=8000

# R configuration
R_HOME=/usr/lib/R
R_LIBS_USER=./R_libs

# Testing settings
PYTEST_CURRENT_TEST=true

Development Workflow

1. Code Style and Formatting

Ruff Configuration

Create ruff.toml:

      [tool.ruff]
target-version = "py39"
line-length = 88
select = [
    "E",  # pycodestyle errors
    "W",  # pycodestyle warnings
    "F",  # pyflakes
    "I",  # isort
    "B",  # flake8-bugbear
    "C4", # flake8-comprehensions
    "UP", # pyupgrade
]
ignore = [
    "E501",  # line too long (handled by black)
    "B008",  # do not perform function calls in argument defaults
    "C901",  # too complex
]

[tool.ruff.per-file-ignores]
"__init__.py" = ["F401"]
"tests/**/*" = ["S101", "ARG", "FBT"]

[tool.ruff.isort]
known-first-party = ["celline"]

MyPy Configuration

Create mypy.ini:

      [mypy]
python_version = 3.9
warn_return_any = True
warn_unused_configs = True
disallow_untyped_defs = True
disallow_incomplete_defs = True
check_untyped_defs = True
disallow_untyped_decorators = True
no_implicit_optional = True
warn_redundant_casts = True
warn_unused_ignores = True
warn_no_return = True
warn_unreachable = True
strict_equality = True

[mypy-tests.*]
disallow_untyped_defs = False

[mypy-celline.external.*]
ignore_missing_imports = True

2. Testing Setup

Pytest Configuration

Create pytest.ini:

      [tool:pytest]
testpaths = tests
python_files = test_*.py *_test.py
python_classes = Test*
python_functions = test_*
addopts = 
    --strict-markers
    --strict-config
    --verbose
    --cov=celline
    --cov-report=term-missing
    --cov-report=html
    --cov-report=xml
markers =
    slow: marks tests as slow (deselect with '-m "not slow"')
    integration: marks tests as integration tests
    unit: marks tests as unit tests
    requires_r: marks tests that require R
    requires_cellranger: marks tests that require Cell Ranger

Test Structure

      tests/
├── conftest.py              # Shared fixtures
├── unit/                    # Unit tests
│   ├── test_functions/
│   ├── test_database/
│   └── test_utils/
├── integration/             # Integration tests
│   ├── test_workflows/
│   └── test_api/
├── fixtures/                # Test data
│   ├── sample_data/
│   └── mock_responses/
└── performance/             # Performance tests
    └── test_benchmarks.py

Common Test Fixtures

Create tests/conftest.py:

      import pytest
import tempfile
import shutil
from pathlib import Path
from celline import Project
from celline.config import Config

@pytest.fixture
def temp_dir():
    """Create temporary directory"""
    temp_path = tempfile.mkdtemp()
    yield temp_path
    shutil.rmtree(temp_path, ignore_errors=True)

@pytest.fixture
def test_project(temp_dir):
    """Create test project"""
    # Create project structure
    project_path = Path(temp_dir) / "test_project"
    project_path.mkdir()
    
    # Create configuration files
    (project_path / "setting.toml").write_text("""
[project]
name = "test_project"
version = "1.0.0"

[execution]
system = "multithreading"
nthread = 1

[R]
r_path = "/usr/bin/R"
""")
    
    (project_path / "samples.toml").write_text("""
GSM123456 = "Test Sample 1"
GSM789012 = "Test Sample 2"
""")
    
    # Create directories
    (project_path / "data").mkdir()
    (project_path / "resources").mkdir()
    (project_path / "results").mkdir()
    
    return Project(str(project_path), "test_project")

@pytest.fixture
def mock_h5_data():
    """Mock HDF5 data for testing"""
    import numpy as np
    from unittest.mock import Mock
    
    mock_adata = Mock()
    mock_adata.n_obs = 1000
    mock_adata.n_vars = 2000
    mock_adata.X = Mock()
    mock_adata.X.toarray.return_value = np.random.rand(1000, 2000)
    mock_adata.var_names = [f"GENE_{i}" for i in range(2000)]
    mock_adata.obs_names = [f"CELL_{i}" for i in range(1000)]
    
    return mock_adata

@pytest.fixture(scope="session")
def celline_test_config():
    """Test-specific configuration"""
    original_config = {}
    
    # Store original config
    for attr in dir(Config):
        if not attr.startswith('_'):
            original_config[attr] = getattr(Config, attr)
    
    # Set test config
    Config.DEBUG = True
    Config.TEST_MODE = True
    
    yield Config
    
    # Restore original config
    for attr, value in original_config.items():
        setattr(Config, attr, value)

3. Documentation Setup

Sphinx Configuration

Create docs/conf.py:

      import os
import sys
sys.path.insert(0, os.path.abspath('../src'))

project = 'Celline'
copyright = '2024, Celline Contributors'
author = 'Celline Contributors'

extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.viewcode',
    'sphinx.ext.napoleon',
    'sphinx.ext.intersphinx',
    'myst_parser',
]

templates_path = ['_templates']
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']

html_theme = 'sphinx_rtd_theme'
html_static_path = ['_static']

# Napoleon settings
napoleon_google_docstring = True
napoleon_numpy_docstring = True
napoleon_include_init_with_doc = False

# Intersphinx mapping
intersphinx_mapping = {
    'python': ('https://docs.python.org/3/', None),
    'numpy': ('https://numpy.org/doc/stable/', None),
    'pandas': ('https://pandas.pydata.org/docs/', None),
}

Build Documentation

      # Install documentation dependencies
pip install sphinx sphinx-rtd-theme myst-parser

# Build documentation
cd docs
make html

# Live reload during development
pip install sphinx-autobuild
sphinx-autobuild . _build/html

Debugging Tools

1. Debug Configuration

Python Debugger

      # Using pdb for debugging
import pdb

def problematic_function():
    data = load_data()
    pdb.set_trace()  # Debugger breakpoint
    result = process_data(data)
    return result

# Using ipdb (enhanced debugger)
pip install ipdb
import ipdb; ipdb.set_trace()

VS Code Debug Configuration

Create .vscode/launch.json:

      {
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "justMyCode": false
        },
        {
            "name": "Python: Celline CLI",
            "type": "python",
            "request": "launch",
            "module": "celline.cli.main",
            "args": ["run", "info"],
            "console": "integratedTerminal",
            "cwd": "${workspaceFolder}/test_project"
        },
        {
            "name": "Python: Pytest",
            "type": "python",
            "request": "launch",
            "module": "pytest",
            "args": ["tests/unit/", "-v"],
            "console": "integratedTerminal"
        }
    ]
}

2. Logging Configuration

Development Logging

      # celline/log/dev_logger.py
import logging
import sys
from pathlib import Path
from rich.logging import RichHandler

def setup_dev_logging(level=logging.DEBUG):
    """Setup development logging with rich formatting"""
    
    # Create logs directory
    log_dir = Path("logs")
    log_dir.mkdir(exist_ok=True)
    
    # Root logger configuration
    root_logger = logging.getLogger()
    root_logger.setLevel(level)
    
    # Clear existing handlers
    root_logger.handlers.clear()
    
    # Rich console handler
    console_handler = RichHandler(
        rich_tracebacks=True,
        show_path=True,
        show_time=True
    )
    console_handler.setLevel(level)
    
    # File handler
    file_handler = logging.FileHandler(
        log_dir / "celline_dev.log",
        mode='a'
    )
    file_handler.setLevel(logging.DEBUG)
    
    # Formatters
    console_format = "%(message)s"
    file_format = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
    
    console_handler.setFormatter(logging.Formatter(console_format))
    file_handler.setFormatter(logging.Formatter(file_format))
    
    # Add handlers
    root_logger.addHandler(console_handler)
    root_logger.addHandler(file_handler)
    
    # Configure specific loggers
    celline_logger = logging.getLogger("celline")
    celline_logger.setLevel(level)
    
    # Suppress verbose external libraries
    logging.getLogger("requests").setLevel(logging.WARNING)
    logging.getLogger("urllib3").setLevel(logging.WARNING)
    logging.getLogger("matplotlib").setLevel(logging.WARNING)

# Use in development
if __name__ == "__main__":
    setup_dev_logging()

3. Performance Profiling

Function Profiling

      # celline/dev/profiling.py
import cProfile
import pstats
import io
from functools import wraps
import time

def profile_function(sort_by='cumulative', lines_to_print=20):
    """Decorator for profiling functions"""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            profiler = cProfile.Profile()
            profiler.enable()
            
            start_time = time.time()
            try:
                result = func(*args, **kwargs)
            finally:
                end_time = time.time()
                profiler.disable()
                
                # Create string buffer for stats
                s = io.StringIO()
                ps = pstats.Stats(profiler, stream=s)
                ps.sort_stats(sort_by)
                ps.print_stats(lines_to_print)
                
                print(f"\n=== Profile for {func.__name__} ===")
                print(f"Total time: {end_time - start_time:.4f} seconds")
                print(s.getvalue())
                
            return result
        return wrapper
    return decorator

# Usage example
@profile_function(sort_by='tottime', lines_to_print=10)
def expensive_function():
    # Your expensive operation
    pass

Memory Profiling

      # Memory usage monitoring
import psutil
import os
from contextlib import contextmanager

@contextmanager
def memory_profiler(description=""):
    """Context manager for memory profiling"""
    process = psutil.Process(os.getpid())
    
    # Initial memory
    initial_memory = process.memory_info()
    print(f"=== Memory Profile: {description} ===")
    print(f"Initial memory: {initial_memory.rss / 1024 / 1024:.2f} MB")
    
    try:
        yield process
    finally:
        # Final memory
        final_memory = process.memory_info()
        memory_diff = (final_memory.rss - initial_memory.rss) / 1024 / 1024
        
        print(f"Final memory: {final_memory.rss / 1024 / 1024:.2f} MB")
        print(f"Memory difference: {memory_diff:+.2f} MB")
        print(f"Peak memory: {process.memory_info().rss / 1024 / 1024:.2f} MB")

# Usage
with memory_profiler("data processing"):
    result = process_large_dataset(data)

Development Scripts

Build and Test Script

Create scripts/dev.py:

      #!/usr/bin/env python3
"""Development automation script"""

import subprocess
import sys
import argparse
from pathlib import Path

def run_command(cmd, description=""):
    """Run shell command with error handling"""
    print(f"\n{'='*50}")
    print(f"Running: {description or cmd}")
    print(f"{'='*50}")
    
    result = subprocess.run(cmd, shell=True)
    if result.returncode != 0:
        print(f"❌ Command failed: {cmd}")
        sys.exit(1)
    print(f"✅ Command succeeded: {description or cmd}")

def format_code():
    """Format code with ruff and black"""
    run_command("ruff format src/ tests/", "Format code with ruff")
    run_command("ruff check src/ tests/ --fix", "Fix ruff issues")

def type_check():
    """Run type checking with mypy"""
    run_command("mypy src/celline/", "Type checking with mypy")

def run_tests(test_type="all"):
    """Run tests"""
    if test_type == "unit":
        cmd = "pytest tests/unit/ -v"
    elif test_type == "integration":
        cmd = "pytest tests/integration/ -v"
    elif test_type == "fast":
        cmd = "pytest tests/ -v -m 'not slow'"
    else:
        cmd = "pytest tests/ -v"
    
    run_command(cmd, f"Running {test_type} tests")

def build_docs():
    """Build documentation"""
    run_command("cd docs && make clean && make html", "Build documentation")

def lint_all():
    """Run all linting tools"""
    run_command("ruff check src/ tests/", "Lint with ruff")
    run_command("mypy src/celline/", "Type check with mypy")

def full_check():
    """Run full development check"""
    format_code()
    lint_all()
    run_tests("fast")

def main():
    parser = argparse.ArgumentParser(description="Development automation")
    parser.add_argument("command", choices=[
        "format", "lint", "typecheck", "test", "test-unit", 
        "test-integration", "test-fast", "docs", "check"
    ])
    
    args = parser.parse_args()
    
    if args.command == "format":
        format_code()
    elif args.command == "lint":
        lint_all()
    elif args.command == "typecheck":
        type_check()
    elif args.command == "test":
        run_tests("all")
    elif args.command == "test-unit":
        run_tests("unit")
    elif args.command == "test-integration":
        run_tests("integration")
    elif args.command == "test-fast":
        run_tests("fast")
    elif args.command == "docs":
        build_docs()
    elif args.command == "check":
        full_check()

if __name__ == "__main__":
    main()

Make it executable:

      chmod +x scripts/dev.py

# Usage examples
python scripts/dev.py format
python scripts/dev.py test-fast
python scripts/dev.py check

IDE Integration

VS Code Extensions

Recommended extensions for Celline development:

      {
  "recommendations": [
    "ms-python.python",
    "ms-python.mypy-type-checker",
    "charliermarsh.ruff",
    "ms-python.black-formatter",
    "ms-toolsai.jupyter",
    "redhat.vscode-yaml",
    "tamasfe.even-better-toml",
    "eamodio.gitlens",
    "github.vscode-pull-request-github",
    "ms-vscode.vscode-json"
  ]
}

PyCharm Setup

Project Structure: Mark src as sources root
Python Interpreter: Set to virtual environment
Code Style: Import from .editorconfig
Run Configurations: Set up for pytest and CLI
File Watchers: Configure for ruff and mypy

Troubleshooting

Common Development Issues

1. Import Errors

      # Fix PYTHONPATH issues
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent / "src"))

2. Test Environment Issues

      # Clean test environment
rm -rf .pytest_cache/
rm -rf .mypy_cache/
rm -rf __pycache__/
find . -name "*.pyc" -delete

# Reset virtual environment
deactivate
rm -rf .venv/
uv sync --all-extras --dev

3. R Integration Issues

      # Check R installation
R --version
which R

# Install required R packages
R -e "install.packages(c('Seurat', 'scPred', 'harmony'))"

# Set R_HOME if needed
export R_HOME=/usr/lib/R

4. Node.js Frontend Issues

      # Clean Node.js environment
rm -rf node_modules/
rm package-lock.json
npm cache clean --force
npm install

This comprehensive setup guide provides everything needed to start developing with Celline efficiently and maintain high code quality throughout the development process.