
Carlos Camilleri-Robles, PhD

Bioinformatics Engineer | Barcelona, Spain

LinkedIn ORCID Email

I build reproducible, production-grade bioinformatics pipelines for multi-omics data, from raw NGS reads to documented, reusable analytical tools. Background in Nextflow DSL2, SLURM HPC, and multi-omics integration, with a PhD in Genetics and an MSc in Bioinformatics and Biostatistics.


🔬 What I build

  • NGS Pipelines | Nextflow DSL2 · SLURM · Apptainer/Singularity · Conda
  • Multi-Omics Analysis | RNA-seq · ATAC-seq · ChIP-seq · Hi-C · WES
  • Data Engineering | Python ETL · API integration · count matrix production
  • Statistical Methods | Differential analysis · empirical null models · FDR · ML
  • Reproducibility | Containerized workflows · SOP documentation · FAIR practices

🧬 Featured projects

chip-nf - End-to-end ChIP-seq pipeline

Nextflow DSL2 pipeline covering the full ChIP-seq workflow: QC → alignment → peak calling → differential analysis → annotation → visualization. Each module has its own Conda environment, and execution profiles are provided for SLURM and local runs. A single command goes from raw reads to annotated differential peaks and a MultiQC report.

Nextflow · Bowtie2 · MACS2 · DESeq2 · HOMER · deepTools · SLURM · Conda


PyGDC-RNA-ETL - Clinical genomics ETL pipeline

Python ETL pipeline for large-scale extraction and integration of RNA-seq data and clinical metadata from the NCI Genomic Data Commons. Parallelized batched API downloads with auto-resume, AJCC stage normalization, somatic mutation annotation, and analysis-ready output for DE and ML workflows.

Python · ETL · GDC API · pandas · clinical metadata · cancer genomics
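
The batching and auto-resume idea described above can be illustrated with a minimal sketch. This is not the PyGDC-RNA-ETL implementation: `download_with_resume`, the injected `fetch_batch` callable, the `.tsv` suffix, and the default batch size are all hypothetical names chosen for illustration, and the sketch runs batches sequentially rather than in parallel.

```python
from pathlib import Path


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def download_with_resume(file_ids, out_dir, fetch_batch, batch_size=50):
    """Fetch only the files that are not yet on disk, in batches.

    `fetch_batch` is a hypothetical callable that takes a list of IDs and
    returns a dict mapping each ID to its content as bytes.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Resume: anything already written by a previous run is skipped.
    pending = [fid for fid in file_ids if not (out_dir / f"{fid}.tsv").exists()]
    for batch in chunked(pending, batch_size):
        for fid, payload in fetch_batch(batch).items():
            # Write to a temporary name first so an interrupted run never
            # leaves a partial file that looks complete.
            tmp = out_dir / f"{fid}.tsv.part"
            tmp.write_bytes(payload)
            tmp.replace(out_dir / f"{fid}.tsv")
    return pending
```

Calling the function a second time with the same IDs returns an empty list, because every file already exists. Injecting `fetch_batch` keeps the resume logic testable without network access.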


parastar - Containerized batch RNA-seq alignment

STAR + GNU Parallel pipeline packaged as an Apptainer/Singularity image (distributed via Zenodo). Single config file, no script editing, dry-run mode, skip-completed logic for resumable HPC runs. Designed for batch processing at scale.

STAR · GNU Parallel · Apptainer · Singularity · SLURM · HPC
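
As a rough illustration of skip-completed logic combined with a dry-run mode, here is a minimal Python sketch. It is not parastar's code (the repository is shell-based): the function names, the sample dictionary, and the use of STAR's `Log.final.out` file as the completion marker are assumptions made for this example.

```python
import subprocess
from pathlib import Path


def plan_alignments(samples, out_dir, genome_dir):
    """Build one STAR command per sample that has not finished yet.

    `samples` maps a sample name to its FASTQ path (hypothetical layout).
    A sample counts as done if `<name>_Log.final.out` exists (assumption).
    """
    out_dir = Path(out_dir)
    commands = []
    for name, fastq in samples.items():
        prefix = out_dir / f"{name}_"
        if Path(f"{prefix}Log.final.out").exists():
            continue  # completed in an earlier run, so skip it
        commands.append([
            "STAR",
            "--genomeDir", str(genome_dir),
            "--readFilesIn", str(fastq),
            "--outFileNamePrefix", str(prefix),
        ])
    return commands


def run_alignments(commands, dry_run=True):
    """Print the commands in dry-run mode, otherwise execute them in order."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Separating planning from execution means the dry run prints exactly the commands a real run would execute.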


loopstrength - Hi-C chromatin loop quantification

R/Python framework for quantifying chromatin loop strength changes between conditions. Implements size- and distance-matched random controls, empirical two-sided p-values, and BH-FDR correction. Used in a Science Advances publication.

R · Python · Hi-C · Cooler · empirical null · 3D genomics
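
The statistical idea behind empirical two-sided p-values and BH-FDR correction can be sketched in a few lines of plain Python. This is a generic illustration, not loopstrength's code: the function names, the choice of the null mean as the center, and the +1 pseudocount are assumptions, and the matched random controls are represented simply as a list of null values.

```python
def empirical_two_sided_p(observed, null_values):
    """Two-sided empirical p-value against a list of null values.

    Counts null values at least as far from the null mean as the
    observation, with a +1 pseudocount so the p-value is never zero.
    """
    center = sum(null_values) / len(null_values)
    distance = abs(observed - center)
    extreme = sum(1 for v in null_values if abs(v - center) >= distance)
    return (extreme + 1) / (len(null_values) + 1)


def bh_fdr(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * n / rank over all equal or larger ranks.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

In practice each loop would get its own set of matched controls, one empirical p-value, and then a single BH correction across all loops.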


🛠️ Stack

  • Languages | Python · R · Bash · Nextflow DSL2 · SQL
  • Workflow | Nextflow DSL2 · SLURM HPC · Apptainer/Singularity · Conda/Mamba · Git
  • NGS Tools | STAR · Bowtie2 · MACS2 · HOMER · DESeq2 · GATK · Cooler · samtools · deepTools · FastQC/MultiQC
  • Python libs | pandas · numpy · requests · matplotlib · scikit-learn
  • R libs | Tidyverse · Bioconductor · ggplot2 · Seurat
  • ML / Stats | PCA · UMAP · random forest · k-means · empirical permutation · BH-FDR
  • Databases | GEO · GDC/TCGA · Ensembl · UCSC · BioMart

📚 Selected publications

  • Llorens-Giralt, P., Camilleri-Robles, C., et al. 3D genome organization in tissue regeneration involves long-range chromatin loops. Science Advances (Accepted)
  • Camilleri-Robles, C., et al. (2024). Long non-coding RNAs involved in Drosophila development and regeneration. NAR Genomics and Bioinformatics
  • Camilleri-Robles, C., et al. (2024). A shift in chromatin binding of phosphorylated p38 precedes transcriptional changes upon oxidative stress. FEBS Letters

Full publication list: ORCID


Pinned

  1. chip-nf (Public)

    Nextflow-based pipeline for the end-to-end analysis of ChIP-seq datasets

    Nextflow

  2. PyGDC-RNA-ETL (Public)

    PyGDC-RNA-ETL is a robust pipeline designed to automate the extraction, transformation, and integration of genomic data from the GDC Data Portal.

    Jupyter Notebook

  3. parastar (Public)

    Batch RNA-seq alignment using STAR and GNU Parallel

    Shell

  4. loopstrength (Public)

    Quantify chromatin loop strength changes between two Hi-C conditions

    Shell