International Journal of Hydrogen Energy, Vol.44, No.33, 17813-17822, 2019
In-silico-mining of small sequence repeats in hydrogenase maturation subunits of E. coli, Clostridium, and Rhodobacter
The world currently faces an energy crisis, so there is a strong need for in silico mining and analysis of bacteria with potential for renewable energy generation, i.e. biohydrogen production using bio-wastes as substrate. Hydrogenase (Hyd), a key enzyme that mediates hydrogen evolution, is supported by multiple maturation proteins such as HypA, HypB, HypC, HypD, HypE, HypF, HycE, and formate dehydrogenase in hydrogen-producing bacteria. Simple sequence repeats (SSRs) are a large source of genetic markers and consist of tandemly repeated short units of 1-6 bp. This is the first study to identify SSRs in the genes encoding hydrogenase maturation proteins (hypABCDEF), hycE, and formate dehydrogenase (fdhF), and to examine the organization of the operons containing these genes, according to their presence in different hydrogen-producing bacteria, i.e. E. coli, Clostridium, and Rhodobacter. Four operons, i.e. eco-b2726 (hypABCDE), eco-b2713 (hypF), eco-b2724 (hycE), and eco-b4079 (fdhF), in E. coli; four operons, i.e. CBO0432 (hypA), cbo-CBO0431 (hypB), cbo-CBO1837 (hypED), and cbo-CBO1837 (hydA), in Clostridium; and two operons, i.e. rsp-RSP_0503 (hypABCDE) and rsp-RSP_0491 (hypF), in Rhodobacter were used in the current study. We performed an intensive search for SSRs in all of the above subunits and found that dinucleotides were the most frequent repeat class, followed by trinucleotides, tetranucleotides, pentanucleotides, and hexanucleotides. A total of 467, 332, and 387 small repeats were detected across the maturation protein genes of E. coli, Clostridium, and Rhodobacter, respectively. The screened repeats were short, owing to the small length of the maturation protein genes, and no long SSRs were identified in this study. The heptamer (GCCGATC)(2) in fdhF of E. coli and the hexamer (GGGAGT)(2) in hypA of Clostridium may serve as ideal markers. SSRIT (Simple Sequence Repeat Identification Tool) software was used to identify the SSRs. The resulting SSR markers may facilitate rapid genetic identification of unknown, potentially hydrogen-producing bacteria. (C) 2019 Hydrogen Energy Publications LLC. Published by Elsevier Ltd. All rights reserved.
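To illustrate what a perfect tandem repeat such as (GCCGATC)(2) means in practice, the following is a minimal Python sketch of an SSR scan. It is not SSRIT and makes no claim to reproduce its algorithm or thresholds; the function name `find_ssrs`, its parameters, and the example sequence are hypothetical and chosen only for illustration.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=2):
    """Return (start, unit, repeat_count) for perfect tandem repeats in seq.

    Hypothetical helper: the unit-length range and minimum repeat count are
    assumptions, not settings taken from the study.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One unit of unit_len bases followed by at least
        # (min_repeats - 1) further exact copies of that unit.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves built from a shorter motif
            # (e.g. "ATAT" is already reported as the dinucleotide "AT").
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits

# Made-up sequence with the heptamer embedded; max_unit is raised to 7
# so that seven-base units are included in the scan.
example = "TTGCCGATCGCCGATCAA"
print(find_ssrs(example, max_unit=7))
```

Each hit gives the 0-based start position, the repeat unit, and the number of tandem copies. Imperfect or compound repeats are not handled by this sketch.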