# Integrative annotation scores of variants for impact on RNA binding protein activities

**Authors:** Jingqi Duan, Audrey P Gasch, Sündüz Keleş

PMC · DOI: 10.1093/bioinformatics/btae181 · 2024-04-18

## TL;DR

This paper introduces INCA, a multi-step variant scoring method that combines ENCODE eCLIP-seq and RBP knockdown RNA-seq data with ClinVar to assess how genetic variants affect RNA binding protein (RBP) activities.

## Contribution

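The main contribution is INCA, a scoring approach that integrates ENCODE RBP data (eCLIP-seq binding profiles and RNA-seq of shRNA knockdowns) with ClinVar and multiple computational approaches to aggregate evidence for each variant. It exploits genotypic differences between the cell lines used for eCLIP-seq.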

## Key findings

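- INCA provides specificity beyond generic scoring of RBP binding disruption, for both candidate variants and their linkage-disequilibrium partners.
- On average, it augments the scoring of 46.2% of candidate variants beyond generic scoring, which aids variant prioritization for follow-up analysis.
- INCA is implemented in R and is available at https://github.com/keleslab/INCA.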

## Abstract

The ENCODE project generated a large collection of eCLIP-seq RNA binding protein (RBP) profiling data with accompanying RNA-seq transcriptomes of shRNA knockdown of RBPs. These data could have utility in understanding the functional impact of genetic variants; however, their potential has not been fully exploited. We implement INCA (Integrative annotation scores of variants for impact on RBP activities) as a multi-step genetic variant scoring approach that leverages the ENCODE RBP data together with ClinVar and integrates multiple computational approaches to aggregate evidence.

INCA evaluates variant impacts on RBP activities by leveraging genotypic differences in cell lines used for eCLIP-seq. We show that INCA provides critical specificity, beyond generic scoring for RBP binding disruption, for candidate variants and their linkage-disequilibrium partners. As a result, it can, on average, augment scoring of 46.2% of the candidate variants beyond generic scoring for RBP binding disruption and aid in variant prioritization for follow-up analysis.

INCA is implemented in R and is available at https://github.com/keleslab/INCA.
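To make the "augment scoring" idea in the abstract concrete, here is a minimal Python sketch of one possible reading: a candidate variant counts as augmented when it is flagged by a generic binding-disruption score and at least one further evidence source supports it. The class `VariantEvidence`, its field names, the counting rule, and the toy variant sets are all assumptions made for illustration. This is not INCA's actual scoring procedure or data model, and the toy numbers are not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantEvidence:
    """Hypothetical per-variant evidence flags (illustrative only)."""
    variant_id: str
    generic_disruption: bool  # flagged by a generic RBP binding-disruption score
    genotype_differs: bool    # eCLIP-seq cell lines carry different genotypes here
    knockdown_effect: bool    # RBP knockdown RNA-seq shows an expression change
    clinvar_reported: bool    # variant has a ClinVar record


def evidence_count(v: VariantEvidence) -> int:
    """Count evidence sources beyond the generic score (assumed rule)."""
    return sum([v.genotype_differs, v.knockdown_effect, v.clinvar_reported])


def is_augmented(v: VariantEvidence) -> bool:
    """A candidate is 'augmented' if any extra evidence source supports it."""
    return v.generic_disruption and evidence_count(v) > 0


def augmented_fraction(variants: list[VariantEvidence]) -> float:
    """Share of generically flagged candidates that gain extra evidence."""
    candidates = [v for v in variants if v.generic_disruption]
    if not candidates:
        return 0.0
    return sum(is_augmented(v) for v in candidates) / len(candidates)


def mean_augmented_fraction(variant_sets: list[list[VariantEvidence]]) -> float:
    """Average the per-set fractions, mirroring an 'on average' summary."""
    fractions = [augmented_fraction(s) for s in variant_sets]
    return sum(fractions) / len(fractions) if fractions else 0.0


# Toy variant sets with made-up identifiers and flags.
toy_set_a = [
    VariantEvidence("var_a1", True, True, False, False),
    VariantEvidence("var_a2", True, False, False, False),
    VariantEvidence("var_a3", True, False, True, True),
    VariantEvidence("var_a4", True, False, False, False),
]
toy_set_b = [
    VariantEvidence("var_b1", True, False, False, True),
    VariantEvidence("var_b2", True, False, False, False),
    VariantEvidence("var_b3", True, False, False, False),
    VariantEvidence("var_b4", True, False, False, False),
    VariantEvidence("var_b5", False, True, True, True),  # not a candidate
]

if __name__ == "__main__":
    print(augmented_fraction(toy_set_a))
    print(augmented_fraction(toy_set_b))
    print(mean_augmented_fraction([toy_set_a, toy_set_b]))
```

Averaging per-set fractions is only one way to interpret "on average"; the paper's full text defines the actual unit of averaging and the evidence that counts toward augmentation.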

## Linked entities

- **Proteins:** RENBP (renin binding protein)

## Full-text entities

- **Genes:** RBMS3 (RNA binding motif single stranded interacting protein 3) [NCBI Gene 27303], HNRNPK (heterogeneous nuclear ribonucleoprotein K) [NCBI Gene 3190] {aka AUKS, CSBP, HNRPK, TUNP}, CARD17P (caspase recruitment domain family member 17, pseudogene) [NCBI Gene 440068] {aka CARD17, INCA}
- **Diseases:** liver cancer (MESH:D006528)
- **Chemicals:** triglycerides (MESH:D014280), lipid (MESH:D008055), SNV (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Mutations:** rs1057868, rs3815455
- **Cell lines:** K562: Homo sapiens (Human), Blast phase chronic myelogenous leukemia, BCR-ABL1 positive, Cancer cell line (CVCL_0004); HepG2: Homo sapiens (Human), Hepatoblastoma, Cancer cell line (CVCL_0027); S2: Drosophila melanogaster (Fruit fly), Spontaneously immortalized cell line (CVCL_Z232)

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC11042904/full.md

---
Source: https://tomesphere.com/paper/PMC11042904