Xu Xiangru (许相儒)

 

 

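Thesis title: Establishment of a Transcriptome Technology Platform and Its Application to the Study of Dendritic Cells and Hepatocellular Carcinoma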

 

 

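About the author: Xu Xiangru, male, born in February 1970, began his studies under Professor Chen Zhu (陈竺) at Shanghai Second Medical University in September 1999 and received his doctoral degree in June 2002.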

                                       

 

 

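  With the essential completion of human genome sequencing, the level at which we seek to understand the essence of life phenomena and human disease states has shifted from genome structure to genome function. Transcriptome research is an important branch of functional genomics. It is also a bridge linking genome structure with the other branches of functional genomics (proteomics and structural genomics), and it is the main basis for the study of gene regulatory networks. The transcriptome research in our laboratory is aimed mainly at organs, tissues and cells that have important physiological functions and major medical research value but for which little information is available in public databases, and research projects are used to drive the establishment of the technology platform. In this study, using dendritic cells (DC) induced from cord blood CD34+ cells and hepatocellular carcinoma/adjacent noncancerous (HCC/liver) tissues as research models, we first successfully constructed a PCR cDNA library from DC and classical cDNA libraries from HCC/adjacent noncancerous tissues, and carried out large-scale 5' EST sequencing. The cDNA libraries constructed by both methods had high titers (>1×10^7 pfu/ml), high recombination efficiency (>90%) and long average insert lengths (>1.5 kb). In addition, the 5' EST clustering results (that is, the number of clusters) showed that both kinds of library are reasonably representative. The two kinds of library differ in that the classical libraries have a slightly larger library capacity and insert size, whereas the PCR library requires less sample and has a slightly higher proportion of full-length clones. In total, more than 47,000 5' EST sequences were determined, all of which have been released to GenBank.

  We then used HCC/liver tissues as a model to study the reproducibility and reliability of a homemade 12.4K nylon membrane cDNA microarray. The quality assessment showed that the homemade nylon membrane cDNA microarray has good reproducibility: the correlation coefficient for repeated measurements of the same sample reached 0.92-0.93, indicating that results obtained with this technique are acceptable. In addition, to integrate the results produced by the two approaches above, we established a complete strategy for the bioinformatics analysis of the transcriptome. When these techniques and strategies were applied to the study of DC differentiation and hepatocarcinogenesis, we obtained a number of meaningful results.

  In the study of DC differentiation, 7,650 EST sequences were assembled into 3,641 clusters, of which 1,455 clusters (4,919 sequences) represent known genes. From these we first obtained a relatively systematic catalog of molecules involved in the immunobiological functions of DC, which clearly reveals the molecular basis of DC chemotaxis, antigen uptake, antigen processing and presentation, and signal transduction. Second, we cloned 53 full-length cDNAs from this library, including 8 with a signal peptide structure (one of which has been confirmed to have secreted-protein activity), 2 members of the CD20 superfamily with the typical four-transmembrane domain, and one seven-transmembrane G-protein-coupled receptor. In addition, comparison with the gene expression profile of the cells from which DC originate, cord blood CD34+ hematopoietic stem/progenitor cells, showed that the Ras-MAPK signaling pathway and other signaling pathways that depend strongly on intracellular Ca2+ signals may jointly participate in DC differentiation, and that marked changes in cytoskeleton-associated molecules also appear to be closely related to the state and differentiation behavior of DC.

  In the study of hepatocarcinogenesis, we integrated 5' EST sequences representing 11,065 gene clusters from HCC/adjacent noncancerous tissues with hybridization data from the homemade cDNA microarray and a commercial tumor-related cDNA microarray. We first obtained a gene expression catalog of HCC/adjacent noncancerous tissues, and comparative analysis yielded 2,253 genes/ESTs as candidates for differential expression between HCC and adjacent noncancerous tissue. We then selected a set of genes/ESTs related to tumorigenesis and to hepatocyte function/differentiation and validated them by semiquantitative reverse transcription PCR in 29 paired HCC/adjacent noncancerous tissue samples. The results showed that the expression of many genes involved in cell cycle regulation, such as cyclins, cyclin-dependent kinases (CDKs) and negative regulators of the cell cycle, was deregulated in most cases. Disordered expression of the Wnt-β-catenin signaling pathway and of enzymes related to DNA replication may also contribute to the occurrence and development of HCC. In addition, two groups among the altered genes/ESTs have potential value for clinical diagnosis: one group is related to metabolism, and the other may represent a dedifferentiated state of malignant hepatocytes. In particular, we also attempted to integrate and compare gene transcription data from several regions of common genomic imbalance in HCC with the results of genomic imbalance assessment. We found a good correlation between alterations of the transcriptome and loss of heterozygosity or amplification in the corresponding chromosomal regions, suggesting that genomic imbalance is one of the underlying causes of the transcriptome abnormalities in HCC.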

 

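Keywords: transcriptome, expressed sequence tag, cDNA microarray, bioinformatics, gene expression profile, dendritic cell, differentiation, hepatocellular carcinoma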

 

 

   Abstract

                            

    With the remarkable achievements of the human genome sequencing project, the focus of efforts to understand the essence of life phenomena and human disease has shifted from genome structure to genome function. Transcriptomics, an important branch of functional genomics, links genome structure with other branches of functional genomics such as proteomics and structural genomics, and is the main basis for clarifying gene regulatory networks. Our transcriptome research focuses mainly on organs, tissues and cells that are involved in important physiological functions and human diseases and for which few transcriptome data are available in the public domain. A program to set up transcriptome analysis platforms was also initiated in our laboratory. In this thesis, we first constructed a PCR cDNA library from dendritic cells (DC) induced from cord blood CD34+ cells and classical cDNA libraries from HCC/liver tissues, and then performed large-scale 5' EST sequencing of clones picked at random from these libraries. The cDNA libraries constructed by the two methods were of high quality, with relatively high titers (>5×10^7 pfu/ml), a high percentage of recombinant clones (>90%) and long inserts (>1.5 kb). The 5' EST clustering results also indicated that both kinds of library are reasonably representative. Comparing the two kinds of library, we found that the titer and insert length of the classical cDNA library are slightly better, whereas the PCR cDNA library requires far less sample for cDNA synthesis and has a slightly higher percentage of full-length clones. In total, we sequenced more than 47,000 5' ESTs and released them to GenBank. The HCC/liver tissues were then used as a model to test the reproducibility and reliability of a homemade 12.4K nylon membrane cDNA microarray.
The results showed that the homemade cDNA microarray has good reproducibility and yields acceptable data: the correlation coefficient between replicate measurements of the same sample reached 0.92-0.93. Moreover, to integrate the data produced by the two approaches above, we established a bioinformatics analysis strategy for transcriptome data. When these techniques and this strategy were applied to the study of DC differentiation and hepatocarcinogenesis, we obtained a number of meaningful results. In the study of DC differentiation, the 7,650 EST sequences were assembled into 3,641 clusters, of which 1,455 clusters represent known genes. Based on existing knowledge of these genes, we obtained a relatively systematic molecular catalog of DC immunobiological functions such as chemoattraction, antigen uptake, antigen processing and presentation, and signaling. Furthermore, 53 full-length cDNAs were cloned from the DC cDNA library, including 8 that possess a signal peptide structure (one of which has been verified to be a secreted protein), 2 members of the CD20 superfamily with four membrane-spanning domains, and a new G-protein-coupled receptor. In addition, comparison of the gene expression profile of DC with that of their parental CD34+ hematopoietic stem/progenitor cells revealed that the Ras-MAPK pathway and other pathways that depend strongly on the cellular Ca2+ concentration possibly act together during DC differentiation, and that cytoskeleton-associated molecules with significantly changed expression levels also seem to be closely related to the state maintenance and differentiation behavior of DC.
In the hepatocarcinogenesis study, we report on a comprehensive characterization of gene expression profiles of hepatitis B virus-positive HCC through the generation of a large set of 5'-read expressed sequence tag (EST) clusters (11,065 in total) from HCC and noncancerous liver samples, which then were applied to a cDNA microarray system containing 12,393 genes/ESTs and to comparison with a public database. The commercial cDNA microarray, which contains 1,176 known genes related to oncogenesis, was used also for profiling gene expression. Integrated data from the above approaches identified 2,253 genes/ESTs as candidates with differential expression. A number of genes related to oncogenesis and hepatic function/differentiation were selected for further semiquantitative reverse transcriptase–PCR analysis in 29 paired HCC/liver samples. Many genes involved in cell cycle regulation such as cyclins, cyclin-dependent kinases, and cell cycle negative regulators were deregulated in most patients with HCC. Aberrant expression of the Wnt-β-catenin pathway and enzymes for DNA replication also could contribute to the pathogenesis of HCC. The alteration of transcription levels was noted in a large number of genes implicated in metabolism, whereas a profile change of others might represent a status of dedifferentiation of the malignant hepatocytes, both considered as potential markers of diagnostic value. Notably, the altered transcriptome profiles in HCC could be correlated to a number of chromosome regions with amplification or loss of heterozygosity, providing one of the underlying causes of the transcription anomaly of HCC.

 

Keywords: transcriptome, expressed sequence tag (EST), cDNA microarray, bioinformatics, gene expression profile, dendritic cell (DC), differentiation, hepatocellular carcinoma (HCC)
