# CropGene: a software package for the analysis of genomic and transcriptomic data of agricultural plants

**Authors:** A.Yu. Pronozin, D.I. Karetnikov, N.A. Shmakov, M.E. Bocharnikova, S.D. Afonnikova, D.A. Afonnikov, N.A. Kolchanov

PMC · DOI: 10.18699/vjgb-25-35 · Vavilov Journal of Genetics and Breeding · 2025-04-01

## TL;DR

CropGene is a software package designed to analyze genomic and transcriptomic data of agricultural plants, making plant breeding more efficient.

## Contribution

CropGene introduces new methods for analyzing long non-coding RNAs, protein domains, and genome-wide associations in agricultural plants.

## Key findings

- CropGene identified genetic markers explaining up to 50% of seed color variability.
- More than 100,000 new long non-coding RNAs were discovered using CropGene.
- Potential genes for potato variety development and orthogroups with A2 phospholipase-like domains were identified.

## Abstract

The breeding of agricultural plants increasingly relies on molecular biological data on genetic sequences, which makes it possible to significantly accelerate the breeding process and to create new plant varieties through genome editing. These data are large and diverse, and their analysis requires substantial resources, both labor and computing. Data of such volume and complexity can be analyzed effectively only with modern bioinformatics methods, which include algorithms for identifying genes, predicting their function, and evaluating the effect of mutations on plant phenotype. Such analysis is now practically impossible without integrated software systems that solve problems at different levels by executing computational pipelines. The paper describes the CropGene software package, developed for the comprehensive analysis of genomic and transcriptomic data of agricultural plants. CropGene includes several blocks of bioinformatic analysis, such as analysis of gene variations, assembly of genomes and transcriptomes, and annotation of genes and proteins. CropGene implements new methods for analyzing long non-coding RNAs and protein domains, for searching for and analyzing polymorphisms, and for genome-wide association studies. CropGene has a user-friendly interface and supports various types of data, which greatly simplifies its use for researchers without deep knowledge of bioinformatics. The paper provides examples of the use of CropGene for the analysis of agricultural species such as Solanum tuberosum and Zea mays. With CropGene, the following were identified: genetic markers that explain up to 50% of the variability in seed color parameters; potential genes that may become promising material for producing potato varieties; and more than 100 thousand new long non-coding RNAs. Orthogroups were also found whose domain structure shows a marked similarity to the domain architecture of characteristic secreted A2 phospholipases. Thus, CropGene is an important tool for scientists and practitioners working in agrobiotechnology and plant genetics.
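
As a rough illustration of what it means for a marker to "explain" a share of trait variability, the sketch below computes the coefficient of determination (R²) of a single-marker linear regression of a phenotype on genotype dosage. This is an assumption about how such a figure is commonly obtained, not a description of CropGene's own association method, which this summary does not detail. The function name `variance_explained` and the data are hypothetical and synthetic.

```python
def variance_explained(genotypes, phenotypes):
    """Return R^2 of a simple linear regression of phenotype on genotype.

    genotypes  -- allele dosages per plant (e.g. 0, 1, 2); hypothetical encoding
    phenotypes -- trait values per plant (e.g. a seed color parameter)
    """
    n = len(genotypes)
    if n != len(phenotypes) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")

    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n

    # Sums of squares and cross-products around the means.
    s_gg = sum((g - mean_g) ** 2 for g in genotypes)
    s_pp = sum((p - mean_p) ** 2 for p in phenotypes)
    s_gp = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))

    # A monomorphic marker or a constant trait explains nothing.
    if s_gg == 0 or s_pp == 0:
        return 0.0

    # For one predictor, R^2 equals the squared Pearson correlation.
    return (s_gp ** 2) / (s_gg * s_pp)


# Synthetic example data (not taken from the paper).
example_genotypes = [0, 0, 1, 1, 2, 2]
example_phenotypes = [1.0, 3.0, 2.0, 4.0, 3.0, 5.0]
example_r2 = variance_explained(example_genotypes, example_phenotypes)
print(f"Share of trait variance explained by the marker: {example_r2:.2f}")
```

In this synthetic case the marker accounts for 40% of the trait variance. Real association analyses typically also correct for population structure and relatedness, which this sketch deliberately leaves out.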

## Linked entities

- **Species:** Solanum tuberosum (taxon 4113), Zea mays (taxon 4577)

## Full-text entities

- **Species:** Zea mays (maize, species) [taxon 4577], Solanum tuberosum (potatoes, species) [taxon 4113]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12011622/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12011622/full.md

---
Source: https://tomesphere.com/paper/PMC12011622