key: cord-0000945-affnaoni
authors: Barnes, David W.; Mattingly, Carolyn J.; Parton, Angela; Dowell, Lori M.; Bayne, Christopher J.; Forrest, John N.
title: Marine Organism Cell Biology and Regulatory Sequence Discovery in Comparative Functional Genomics
date: 2005-11-30
journal: Cytotechnology
DOI: 10.1007/s10616-005-1719-5
sha: ee690dbd15ef27255a975bce6bc8432b9f9886d4
doc_id: 945
cord_uid: affnaoni

The use of bioinformatics to integrate phenotypic and genomic data from mammalian models is well established as a means of understanding human biology and disease. Beyond direct biomedical applications of these approaches in predicting structure–function relationships between coding sequences and protein activities, comparative studies also promote understanding of molecular evolution and the relationship between genomic sequence and morphological and physiological specialization. Recently recognized is the potential of comparative studies to identify functionally significant regulatory regions and to generate experimentally testable hypotheses that contribute to understanding mechanisms that regulate gene expression, including transcriptional activity, alternative splicing and transcript stability. Functional tests of hypotheses generated by computational approaches require experimentally tractable in vitro systems, including cell cultures. Comparative sequence analysis strategies that use genomic sequences from a variety of evolutionarily diverse organisms are critical for identifying conserved regulatory motifs in the 5′-upstream and 3′-downstream regions and the introns of genes.
Genomic sequences and gene orthologues in the first aquatic vertebrate and protovertebrate organisms to be fully sequenced (Fugu rubripes, Ciona intestinalis, Tetraodon nigroviridis, Danio rerio) as well as in the elasmobranchs, spiny dogfish shark (Squalus acanthias) and little skate (Raja erinacea), and marine invertebrate models such as the sea urchin (Strongylocentrotus purpuratus) are valuable in the prediction of putative genomic regulatory regions. Cell cultures have been derived for these and other model species. Data and tools resulting from these kinds of studies will contribute to understanding transcriptional regulation of biomedically important genes and provide new avenues for medical therapeutics and disease prevention.

the species and provided additional insights about genomic evolution. Another area of interest that stands to benefit greatly from genomic data is that of gene regulation. Understanding in this field is fragmented and, as with any scientific discipline, the concepts and questions that can be conceived and addressed experimentally are dependent on available technological approaches. Mechanisms by which expression of a single gene is regulated can be extremely complicated. Multiple phosphorylation- or ligand-dependent nuclear receptors that homo- or heterodimerize may be required to achieve activity. Each of these receptors may have different activation specificity or duration, even when acting via the same regulatory DNA sequence such as classical proximal promoter elements. These receptors may also work in combination with other transcription factors that function at sites more distal from the proximal promoter or in introns. Alternatively spliced transcripts represent another complex aspect of gene expression regulation that is influenced by extracellular and intracellular signaling but is not well understood (Stamm et al. 2005).
Furthermore, individual genes often are part of a broader, coordinately regulated network of genes that function to elicit a set of cellular responses (Wagner 1999). Through such mechanisms, ligands may, for instance, induce their own metabolism or export, a process that further complicates understanding of gene regulation and that also has critical implications for models of pharmacokinetics and drug efficacy.

Experimental identification of functional genomic sequences depends heavily on cell culture and other techniques of in vitro cell biology. Traditionally, identification of gene regulatory regions has been limited by the labor-intensiveness of the requisite strategies. Generally only regions close to the transcriptional start sites have been experimentally tractable for detailed examination, despite evidence that important regulatory regions exist more than 10 kb upstream or downstream from the coding region or in introns of genes (Rowntree et al. 2001). Identification of specific functional sequences through the generation of deletion constructs is limited in the sequence size that can be analyzed and is restricted to examination of single genes. Cells transfected with reporter constructs containing putative proximal promoters may show strong activation when treated with receptor-specific ligands in culture, while in vivo studies may yield inconsistent results, suggesting that these receptors and transcription factors are a subset of a larger complex of regulators of gene expression. Techniques such as DNase I hypersensitivity studies and gel shifts are excellent methods for testing functionality of putative regulatory elements; however, they are not efficient for screening candidate sequences. By contrast, computational analyses allow the examination of a significantly larger region when predicting conserved regulatory regions and signals.
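To illustrate how a computational screen can scan a large region for conserved segments, the following is a minimal sketch that computes percent identity in sliding windows over a pairwise alignment. It is not the method of any tool discussed in this review; the function names, the 100-column window and the 70% identity threshold are assumptions chosen for illustration, and the input is assumed to be two already-aligned sequences of equal length with `-` marking gaps.

```python
from typing import List, Tuple


def conserved_windows(aln_a: str, aln_b: str, window: int = 100,
                      min_identity: float = 0.70) -> List[Tuple[int, int, float]]:
    """Return (start, end, identity) for every window of alignment columns
    whose fraction of identical, non-gap columns meets the threshold.
    Window size and threshold are illustrative assumptions."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    n = len(aln_a)
    if n < window:
        return []
    # 1 where both rows carry the same base; gap columns never count as matches
    match = [int(a == b and a != "-")
             for a, b in zip(aln_a.upper(), aln_b.upper())]
    hits = []
    running = sum(match[:window])
    for start in range(n - window + 1):
        if start > 0:
            # slide the window by one column
            running += match[start + window - 1] - match[start - 1]
        identity = running / window
        if identity >= min_identity:
            hits.append((start, start + window, identity))
    return hits


def merge_windows(hits: List[Tuple[int, int, float]]) -> List[Tuple[int, int]]:
    """Merge overlapping or adjacent windows into (start, end) intervals."""
    merged: List[Tuple[int, int]] = []
    for start, end, _ in hits:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Intervals returned by `merge_windows` are candidates only; as the text notes, functional assays are still required to test whether a conserved segment is a regulatory element.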
Computational techniques lend themselves to the identification of patterns or clusters of regulatory motifs and prediction of coregulated genes, and generate targeted predictions of candidate regulatory sequences and signaling molecules that can then be directly tested in functional and mutagenesis studies (Hughes et al. 2000; Loots et al. 2000; Pennacchio and Rubin 2003; Ovcharenko et al. 2005). Understanding causative relationships between specific regulatory elements and expression patterns would greatly enhance the ability to predict disease-associated genes (Pennacchio and Rubin 2003). However, even in computational approaches the ability to identify and predict the regulatory functions of non-coding sequences has been limited (Pennacchio and Rubin 2003). Pairwise comparisons have helped to predict functionally conserved regions; however, the statistical accuracy of these predictions is increased when more than two sequences are used (Dubchak et al. 2000). Several studies suggest that comparative analysis of multiple evolutionarily diverse organisms facilitates the prediction of functionally important non-coding regions (Dubchak et al. 2000; Matys et al. 2003; Thomas et al. 2003). Comparisons of genome sequences from evolutionarily diverse organisms also elucidate regulatory regions that are specific to a particular species or group of species. Recent additions of the western clawed frog (Xenopus tropicalis), chicken (Gallus gallus) and dog (Canis familiaris) to the sequencing pipeline will provide important complementary perspectives for tetrapods, but data from more divergent vertebrate genomes are still needed. Different rates of molecular evolution, including gene duplication, underscore the importance of examining genomic sequences from an evolutionary range of organisms in comparative sequence analyses.
When such an approach is taken, the contrast of divergent sequences differentiates between functionally conserved regions and generally conserved regions that simply reflect a lack of divergence time (Dubchak et al. 2000). Using sequences from evolutionarily diverse organisms may provide the necessary divergence to identify functionally conserved regions in genes that have evolved slowly.

The increasing availability of genomic data for non-mammalian organisms and similarities to the human genome underscore the value of these organisms as models of a variety of diseases in humans (Aparicio et al. 2002; Dehal et al. 2002; Ballatori et al. 2003). The identification of conserved coding and regulatory regions is enhanced by including divergent sequences in comparative studies (Thomas et al. 2003) because these sequences provide more stringent filters for detecting conserved, and presumably functionally important, elements (Dubchak et al. 2000; Thomas et al. 2003; Ahituv et al. 2004). Consequently, sequences from evolutionarily distant marine vertebrates and protovertebrates are being used in comparative studies with increasing frequency. The pufferfish (Fugu rubripes) genome sequencing project supported this approach as it led to the discovery of nearly 1000 human genes not previously described (Aparicio et al. 2002). Sequence comparisons of the Hoxb-4 gene in mouse and Fugu identified novel regulatory elements that directed subsets of the full Hoxb-4 expression pattern in transgenic mice (Aparicio et al. 1995; Amores et al. 2004). In a recent comparative analysis of the HoxA cluster in human, horned shark and zebrafish (Chiu et al. 2002), extensive conservation of non-coding sequence motifs was found between the human and shark sequences, whereas zebrafish sequences exhibited significant loss of conservation.
The majority of newly identified regulatory elements for this cluster of genes were identical to known binding sites for regulatory proteins as defined in the transcription factor database, TRANSFAC (http://www.biobase.de; Matys et al. 2003), demonstrating the accuracy of this approach (Matys et al. 2003; Santini et al. 2003).

Special physiological attributes exhibited by many evolutionarily diverse aquatic organisms have led to the increased appreciation of their importance as models of human disease. These characteristics have been exploited experimentally to further understanding of human immunology, genomics, stem cell and cancer biology, pharmacology, toxicology and neurobiology. Historically, marine vertebrates provided critical insights into fundamental mechanisms of physiological processes, and the value of these organisms has not been diminished by the advent of molecular approaches. Membrane transporters that are the sites of action of diuretic drugs were first cloned from specialized organs in marine species (Gamba et al. 1993; Xu et al. 1994). Mutagenesis studies in bony fish generated a spectrum of biologically relevant and distinct phenotypes (Naruse et al. 2004; Walter et al. 2004). Large-scale genetic screens produced more than 500 zebrafish mutants, many with phenotypes similar to human disorders (Dooley and Zon 2000). Immunogenetic studies in carp, salmonids and other species are contributing to the growing database of marine vertebrate genomics (Fujiki et al. 2001, 2003; Tomana et al. 2002; Ishikawa et al. 2004).

Although increasing numbers of vertebrate genomes are being sequenced, there are still stretches of evolutionary history without representation (Thomas and Touchman 2002). Among aquatic organisms, teleosts are represented by significant EST and genomic sequencing, such as those in zebrafish and pufferfish.
Until recently, chondrichthyes, which include elasmobranchs (sharks, rays and skates), were the only major line of gnathostomes, or jawed vertebrates, for which there were no major initiatives to generate genomic sequences. Sequence data from elasmobranchs have provided unique insights into conserved functional domains of genes associated with human liver function, multidrug resistance, cystic fibrosis, G protein coupled receptors, natriuretic peptide receptors, and other biomedically relevant genes (Valentich and Forrest 1991; Henson et al. 1997; Aller et al. 1999; Silva et al. 1999; Greger et al. 1999; Waldegger et al. 1999; Ke et al. 2002; Wang et al. 2002; Yang et al. 2002; Cai et al. 2001, 2003; Mattingly et al. 2004b). In March 2005 the National Human Genome Research Institute of the National Institutes of Health (NIH), USA, announced that it will fund the whole genome sequencing of Raja erinacea. In the announcement of the Skate Genome Sequencing Initiative (http://www.genome.gov/13014443), NIH states: 'The skate (related to many species of shark and cartilaginous fish) was chosen because it belongs to the first group of primitive vertebrates that developed jaws, an important step in vertebrate evolution. Other innovations in this group of animals include an adaptive immune system similar to that of humans, a closed and pressurized circulatory system, and myelination of the nervous system. Understanding these systems of the skate at a genetic level will help scientists identify the minimum set of genes that create a nervous system or develop a jaw, possibly illustrating how these systems have evolved in humans, and how they sometimes go wrong.'

Chondrichthyan (cartilaginous) fish appeared approximately 450 million years ago. Elasmobranchs comprise most chondrichthyan organisms. They exhibit fundamental vertebrate characteristics, including a recombinatorial immune system (Hinds and Litman 1986; Adelman et al. 2004), and are also the oldest existing vertebrates with circulatory system-related signaling molecules and receptors such as platelet-derived growth factor and adenosine receptors. The specialized rectal gland of Squalus acanthias, the spiny dogfish shark, has greatly facilitated the study of cystic fibrosis, sodium and chloride secretion (Devor et al. 1995; Lehrich et al. 1995; Forrest 1996; Henson et al. 1997; Lehrich et al. 1998; Silva et al. 1999; Aller et al. 1999; Greger et al. 1999; Waldegger et al. 1999; Ke et al. 2002; Yang et al. 2002). Unlike many primary cell cultures that dedifferentiate and lose transport polarity immediately after isolation, primary cultures of shark rectal gland tubular cells maintain fully differentiated function and expression of all known receptors, transporters, ion channels and signal transduction pathways in vitro (Valentich and Forrest 1991; Devor et al. 1995; Lehrich et al. 1995; Aller et al. 1999; Greger et al. 1999; Waldegger et al. 1999; Ke et al. 2002; Yang et al. 2002; Mattingly et al. 2004b). A comparison of the properties of the cloned shark and human cystic fibrosis transmembrane conductance regulator (CFTR) has provided insights into structural domains related to functional differences in the normal and mutant proteins (Marshall et al. 1991). The spiny dogfish shark CFTR protein is 72% identical to the human ortholog, and comparison of the human and shark CFTR sequences revealed conservation of five cyclic AMP-dependent kinase phosphorylation sites and three residues that, when mutated in the human protein, are associated with cystic fibrosis. The coding sequences and functions of a number of medically relevant genes are conserved in the spiny dogfish shark and little skate (Cai et al. 2001, 2003; Wang et al. 2002; Yang et al. 2002; Mattingly et al. 2004b).
Primary hepatocytes from little skate retain hepatobiliary polarity for at least 8 h and possibly up to several days in culture, offering particular advantages for studies of liver-specific functions. Genomic information applied to existing physiological data in these systems, along with the further development of in vitro cell culture systems, will allow the testing of molecular hypotheses and understanding of regulatory mechanisms that are directly applicable to human biology.

Targeted sequencing of well-defined genomic regions generated from BAC clones has been extremely useful in providing supporting genomic information in species for which complete genome data are not available. In addition to constructing four-fold coverage bacterial artificial chromosome libraries from sperm DNA of dogfish shark and little skate (http://www.mdibl.org/research/skategenome.shtml), an expressed sequence tag (EST) sequencing project is underway to substantially increase the availability of sequence data for these model organisms (Mattingly et al. 2004b). Over 10,000 sequences are publicly available through the EST database at the National Center for Biotechnology Information (dbEST; Boguski et al. 1993) and http://www.mdibl.org/decypher. These data sets are updated as new sequences become available.

One of the remarkable findings of the human genome sequencing project was the discovery that coding regions account for only 5% of the genome (Venter et al. 2001). The remaining sequence consists of repetitive DNA (~40-45%) and extensive non-coding regions for which there is very little functional information. It is within these regions that significant regulatory information presumably is concealed. The major challenge in the postgenomic age is uncovering important functional regions within this non-coding DNA.
The rapid development of sequence analysis software tools, increasing availability of genomic data, and recognition of the importance of comparing data from diverse organisms are allowing scientists to make fruitful inroads to understanding genomic structure and gene expression regulation. A brief summary of software tools that are valuable for identifying regulatory information from cross-species comparative analyses follows.

The University of California Santa Cruz Genome Browser (http://genome.ucsc.edu/; Karolchik et al. 2003) is among the most popular databases for querying and retrieving genomic sequences of interest. It currently provides access to genomic assemblies from 23 organisms, including 10 vertebrates, 8 insects, 2 nematodes, a sea squirt, baker's yeast, and the SARS virus. Whether genomic regions are retrieved from an existing database or sequenced locally, there are an increasing number of options for analysis and annotation.

Several computational tools allow identification of genes and exon boundaries in genomic sequences. GENSCAN (Burge and Karlin 1997), TWINSCAN (Korf et al. 2001), MZEF (Zhang 1997), and Gene Recognition and Analysis Internet Link (GRAIL; Uberbacher and Mural 1991) are among the most popular. Most gene finding programs were optimized to predict genes in sequences from mammalian models; as a result, accuracy is sometimes reduced when using sequences from more divergent organisms or nonvertebrates. These programs use different strategies and are based on current, but incomplete, understanding of gene structure. Therefore, predictions from multiple programs should be combined computationally to enhance accuracy and confidence (Rogic et al. 2002). Quality control strategies can be used to increase accuracy of and confidence in results from gene finding tools. First, the abundance of repetitive elements in genomic sequences can distort predictions of genes and exon locations.
To counter this problem, masking genomic sequences is recommended, before gene analysis, using a program like RepeatMasker (A.F.A. Smit et al. 1996, unpublished). The effectiveness of RepeatMasker with marine and other evolutionarily divergent organisms is not yet clear because interspersed repeats are specific to a species or group of species. Sequences from such organisms should be evaluated under masked and unmasked conditions. Second, aligning ESTs or cDNAs with genomic sequences using programs like Spidey (http://www.ncbi.nlm.nih.gov/spidey/; Wheelan et al. 2001) refines predictions of exon and intron boundaries, promoter regions, and splice sites. This approach presents challenges for transcripts that are expressed at very low levels, have significant tissue- or age-specific requirements, or are from species for which minimal sequencing has been done (Schwartz et al. 2000). A new, publicly available resource, the Comparative Toxicogenomics Database (CTD; http://ctd.mdibl.org; Mattingly et al. 2003, 2004a), provides multiple alignment and phylogenetic analysis results with sequences from diverse organisms for biomedically significant genes and proteins. CTD provides access to data valuable for identifying homologous genomic sequences and confirming gene and gene feature predictions. Identification of gene features greatly improves interpretation of subsequent multiple sequence analysis results.

Aligning multiple genomic sequences is becoming widely accepted as a powerful mechanism for identifying important functional regions such as regulatory elements. MultiPipMaker (http://pipmaker.bx.psu.edu/pipmaker/; Schwartz et al. 2003) and mVISTA (http://gsd.lbl.gov/vista/index.shtml; Bray et al. 2003; Frazer et al. 2004) are two of the most popular web servers that conveniently combine alignment engines with visualization capabilities. MultiPipMaker, and a new server zPicture (http://zpicture.dcode.org/; Ovcharenko et al. 2004), use the local alignment program BLASTZ as their alignment engine (Schwartz et al. 2000); VISTA uses AVID, a global alignment program (Bray et al. 2003; Frazer et al. 2004). Local alignment tools find high-scoring, short matching segments and extend these regions based on a scoring threshold. Local alignments may permit greater diversity between sequences by finding regional similarities. Segments of similarity need not be conserved in order or orientation. This feature may be advantageous for finding conserved transcription factor binding sites, which are very short and prone to reordering (Bray et al. 2003). High similarity of short regions, however, does not necessarily imply homology (derivation from a common ancestral gene) and can lead to false implications of relatedness among sequences that are not homologs. By contrast, global alignments do require that the order and orientation of similar regions be conserved, because similar architecture is often observed in homologous sequences (Bray et al. 2003). MultiPipMaker and VISTA provide visualization options for alignments that include percent similarity and user-submitted annotations (e.g., exon locations, repetitive elements). It is important to note that local and global alignment tools are being refined so rapidly that it is becoming difficult to distinguish between them (Frazer et al. 2004). Readers are referred to two recent reviews of alignment programs for detailed comparisons (Frazer et al. 2003; Pollard et al. 2004).

A major challenge to identifying transcription factor binding sites (TFBSs) is that they tend to be short, degenerate, and occur frequently throughout the genome. Analysis of a single sequence usually leads to an abundance of false positive predictions of TFBSs. Several programs have been developed to respond to this challenge based on two principles.
First, functional regulatory elements are often conserved evolutionarily; therefore, identifying TFBSs that are conserved or aligned in multiple sequences may effectively filter false positive predictions (Ovcharenko et al. 2005). Second, gene expression often results from coordinate activation of multiple, proximal regulatory elements; therefore, identifying TFBSs in clusters, rather than in isolation, may enhance confidence in the functionality of predicted TFBSs. These principles have been leveraged, albeit differently, by rVISTA (http://gsd.lbl.gov/vista/index.shtml; Loots et al. 2002; Loots and Ovcharenko 2004), which is a member of the VISTA suite of tools and is also integrated with zPicture, and the newly launched Mulan (http://mulan.dcode.org/; Ovcharenko et al. 2005). Both servers use profiles of transcription factor binding sites from the TRANSFAC database (http://www.biobase.de; Matys et al. 2003). Because TRANSFAC and other similar resources only contain information for known transcription factor binding sites, they are inherently incomplete. Furthermore, existing tools do not address other important regulatory features such as properties related to protein-protein interactions and chromatin structure, and clusters of binding sites that may have been reshuffled between organisms over evolutionary time (Loots et al. 2002).

Cell culture of marine genomic model species and experimental verification of predictions from comparative analysis of genomic sequences

Using in vitro cell culture systems, the functional significance of conserved, putative regulatory sequences predicted through comparative computational analysis in, for instance, the 5′-upstream region of select genes can be tested experimentally. The availability of sufficient genomic information facilitates targeted studies to evaluate such potential functional regulatory regions.
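Before candidate sites reach the bench, the two filtering principles described earlier for TFBS prediction (evolutionary conservation and clustering) can be illustrated with a minimal sketch. This is not the algorithm used by rVISTA or Mulan; the `Site` type, the function names, the positional tolerance and the cluster window are all assumptions made for illustration, and site positions are assumed to be expressed in the columns of a shared multiple alignment.

```python
from typing import Dict, List, NamedTuple


class Site(NamedTuple):
    factor: str  # transcription factor name (hypothetical labels in examples)
    pos: int     # start column of the predicted site in the multiple alignment


def conserved_sites(predictions: Dict[str, List[Site]],
                    tolerance: int = 5) -> List[Site]:
    """Principle 1: keep sites of the first (reference) species for which every
    other species has a site for the same factor within `tolerance` columns."""
    species = list(predictions)
    reference, others = species[0], species[1:]
    kept = []
    for site in predictions[reference]:
        if all(any(o.factor == site.factor and abs(o.pos - site.pos) <= tolerance
                   for o in predictions[sp])
               for sp in others):
            kept.append(site)
    return kept


def clustered_sites(sites: List[Site], window: int = 200,
                    min_sites: int = 2) -> List[Site]:
    """Principle 2: keep sites that have at least `min_sites` sites
    (counting themselves) within `window` columns."""
    return [s for s in sites
            if sum(abs(t.pos - s.pos) <= window for t in sites) >= min_sites]
```

Applying `clustered_sites(conserved_sites(predictions))` keeps only sites that satisfy both principles. A stricter or looser filter follows from changing `tolerance`, `window` and `min_sites`, which here are arbitrary defaults rather than recommended values.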
Comparative experimental studies can be designed employing cell lines derived from any species, though mammalian cell lines have thus far been favored (Mather and Barnes 1997; Barnes and Sato 2000) . Reporter constructs containing regulatory regions of select genes can be generated using up to 5 kb upstream of the transcriptional start site of the relevant genes; these are inserted upstream of a reporter gene such as that for an enhanced fluorescent protein. In the last decade, significant progress has been made in development of marine and freshwater organism cell lines with utility for genomic studies. A variety of zebrafish cell lines have been developed, some of which maintain a normal karyotype for extended periods in vitro (Barnes and Collodi 2005) . Zebrafish cells in culture can be transfected with plasmid DNA using adaptations of approaches common for mammalian cells in vitro, and transient transfection methods have identified expression of genes under control of a number of mammalian and fish promoters. In addition, cell cultures from pufferfish (genera Fugu and Tetraodon) provide a biological complement to the genomic libraries derived to study the molecular biology of these animals, allowing the extension of this model to experiments in functional genomics. Multipassage cell cultures have been established from embryo and adult tissues of species of both of these pufferfish genera (Barnes and Collodi 2005) . One of the Fugu rubripes cultures has been maintained for more than 200 population doublings, and flow cytometry showed that the relative amount of DNA present in cultured cells was approximately 15% of that in human cells, as predicted by biochemical analysis. Telomerase, an enzyme associated with indefinite proliferation in mammalian cell cultures, was easily detectable in these cells, suggesting that the cultures are capable of indefinite growth. 
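The reporter-construct design mentioned above, which uses up to 5 kb of sequence upstream of the transcriptional start site, implies a small coordinate calculation that differs by strand. The sketch below shows one way to extract such a region from a genomic sequence held as a string. The function names and the 0-based coordinate convention are assumptions for illustration, not part of any tool or protocol described in this review.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq: str, tss: int, strand: str,
                    length: int = 5000) -> str:
    """Return up to `length` bases immediately upstream of the transcriptional
    start site, oriented 5' to 3' relative to the gene.

    `tss` is the 0-based forward-strand index of the first transcribed base
    (an assumed convention). The region is truncated at the sequence ends."""
    if strand == "+":
        start = max(0, tss - length)
        return chrom_seq[start:tss]
    if strand == "-":
        # upstream of a minus-strand gene lies at higher forward coordinates
        end = min(len(chrom_seq), tss + 1 + length)
        return reverse_complement(chrom_seq[tss + 1:end])
    raise ValueError("strand must be '+' or '-'")
```

For a minus-strand gene the returned sequence is reverse-complemented so that it reads in the same orientation as the gene, which is the orientation in which it would be placed upstream of a reporter.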
Until recently, the lack of in vitro culture systems for elasmobranch models has been a major limitation for the use of these species in comparative functional genomics. Cold-water marine animals such as the little skate and dogfish shark are useful models for physiology and genomics, but homologous in vitro systems with which investigators can test hypotheses or confirm predictions at a molecular level are essential for widespread use of these models. Expression of elasmobranch genes in heterologous systems such as mammalian cell cultures or Xenopus oocytes may be compromised by differences in membrane lipid composition and missing or interfering accessory and messenger proteins. Heterologous systems are also not adaptable to mechanistic studies using genetically altered, dominant negative molecules. Another advantage derived from cell cultures from species such as shark and skate (that may be only seasonally available) is that they make biological material available year-round in any laboratory. Studies on regulation of gene expression in dogfish shark and little skate are beginning to benefit from the expanding genomic databases for these animals. Skate embryo-derived cell cultures also provide new avenues for elasmobranch research in embryology and organogenesis, toxicology, neurobiology, genome regulation and comparative stem cell biology.

Cells have been cultured from the rectal gland, eye, brain, kidney and early embryo of Squalus acanthias. Medium is either LDF, developed for zebrafish cells (Barnes and Collodi 2005), or VCM (Valentich and Forrest 1991), a urea-trimethylamine oxide (TMAO)-containing medium that supports the short-term culture of shark rectal gland cells. In some cases cells are plated onto dishes pretreated with collagen. Medium is further supplemented with a variety of cell type-specific peptide growth factors, nutrients and purified proteins at a range of concentrations.
These include insulin, transferrin, epidermal growth factor (EGF), basic fibroblast growth factor (FGF), transforming growth factor-beta, vitamins A and E, selenium, mercaptoethanol, dexamethasone, fetal calf serum, shark serum and shark yolk extract. For example, shark brain cells can be grown in primary culture in VCM containing EGF, FGF, insulin, transferrin, selenium, a chemically defined lipid supplement and vitamin E on a fibronectin substratum (Figure 1). Expression of genes from plasmids has been achieved in primary dogfish shark rectal gland cell cultures by lipofection using the CMV promoter to direct expression of the gene of interest (Figure 2). Differentiated function also is maintained, as evidenced by expression of CFTR and vasoactive intestinal peptide receptor (VIPR) mRNA detected by reverse-transcription polymerase chain reaction (RT-PCR) assay (Figures 3 and 4).

Assay for telomerase on cultures of a variety of dogfish shark and little skate cell types showed that all cultures tested were positive, although specific activity varied almost 100-fold among different cell types. Assay for cell proliferation by in situ bromo-deoxyuridine (BrdU) incorporation also was carried out on cultures from shark brain, kidney and rectal gland. The assay involves an overnight incubation with BrdU, followed by an immunoassay identifying cells synthesizing DNA during the time of incubation and incorporating the nucleoside analogue. The results showed that cells in the cultures were synthesizing DNA. The most active synthesis was seen in cultures from early shark embryos. Medium was supplemented with insulin, transferrin, selenous acid, EGF, FGF, L-glutamine, chemically defined lipids, non-essential amino acids, heat-inactivated fetal bovine serum (heat treated for 30 min at 56°C) and shark yolk extract. Conditioned medium from the cells was stimulatory for other shark cell cultures, including the shark rectal gland cells (Figure 5).
A normalized cDNA library has been made from these cells and attempts will be made to identify the cell type by extensive EST analysis.

In addition to the scarcity of cell lines from marine vertebrates, a persistent absence of cell lines from non-arthropod invertebrates has stymied research on many valuable species useful in comparative genomics and a variety of other disciplines (Bayne 1998; Rinkevich 1999). Selection of the echinoderm Strongylocentrotus purpuratus (purple sea urchin) as a model species for genomic sequencing has enhanced the need for cell lines from this species in particular. We recently have explored cell culture using both this species and the closely related Strongylocentrotus droebachiensis (green sea urchin). Cells from Polian vesicles and axial organ (Figures 6 and 7) appear to have proliferated in vitro for more than 8 months (Parton and Bayne 2004, 2005). Other cultures yielded thraustochytrid protists that are common parasites of marine invertebrates worldwide (Rinkevich 1999). Basal nutrient culture medium was LDF diluted with 4 volumes of filtered sea water (Kawamura and Fujiwara 1995) with antibiotics (penicillin 200 U/ml, streptomycin 200 µg/ml, ampicillin 25 µg/ml), sodium bicarbonate at 0.18 g/l and HEPES at 15 mM. Culture vessels were held at 20°C in ambient air. Additives included fetal calf serum (heat inactivated) at 1-3%; insulin at 10 µg/ml; transferrin at 10 µg/ml; FGF at 50 ng/ml; EGF at 50 ng/ml; β-mercaptoethanol at 55 µM; chemically defined lipid supplement at 1 µl/ml; selenium at 10 nM; and L-glutamine at 200 µM.

Transcriptional regulation involves complex interactions of diverse proteins, or transcription factors, with specific DNA sequences in the noncoding regions of target genes. Furthermore, cells respond to environmental stimuli and to developmental signals by altering expression of gene networks.
The limited number of transcription factors suggests that few, if any of these proteins exert their activity exclusively on a single gene; rather, they bind to conserved sites in several genes to coordinate their expression (Wagner 1999; Pennacchio and Rubin 2003) . In situations in which there are no clinically acceptable inhibitors or other modulators of clinically relevant proteins, elucidating mechanisms by which these genes are regulated and identifying other coordinately regulated genes may reveal novel strategies by which disease processes may be disrupted or controlled. Comparative studies provide insight into the possible common ancestor of a gene, trace the accumulation of mutations over time and suggest selective pressures that influence the expression and functions of genes (Pennacchio and Rubin 2003) . Comparisons have provided useful predictions about which sequences are minimally essential for function as well as those that may be important on a species-specific level (Aparicio et al. 1995) . Comparative genomic computational approaches continue to identify conserved regions in non-coding sequences. The challenges will be to determine which sequences are functionally significant and to identify coordinately regulated genes and common regulatory pathways. Understanding such networks, the transcription factor binding sites and genes involved in disease states may reveal alternative points of intervention and contribute to a more predictive approach to molecular medicine. Significant progress has been made in the development of powerful sequence analysis tools. Although often optimized for sequences from more traditional model organisms, they offer great value for comparative studies that include evolutionarily divergent sequences. They should be used cautiously, however, with awareness of their limitations, which are largely a reflection of current scientific knowledge and understanding of genome structure. 
As the diversity of available sequence data continues to increase, it will drive refinements and development of new tools and provide important insights into the evolution and essential mechanisms of gene expression regulation.
Figure 6. Primary culture of cells from sea urchin (Strongylocentrotus) Polian vesicle.
Figure 7. Primary culture of cells from sea urchin (Strongylocentrotus) axial organ.
The natural antibody repertoire of sharks and humans recognizes the potential universe of antigens
Exploiting human-fish genome comparisons for deciphering gene regulation
Cloning, characterization, and functional expression of a CNP receptor regulating CFTR in the shark rectal gland
Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes
Detecting conserved regulatory elements with the model genome of the Japanese puffer fish, Fugu rubripes
Exploiting genome data to understand the function, regulation, and evolutionary origins of toxicologically relevant genes. Environ
Cell Culture Systems in Tissue Engineering
Fish Cell Lines and Stem Cells in the Physiology of Fishes
Invertebrate Cell Culture Considerations: Insects, Ticks, Shellfish, and Worms
Immune-relevant (including acute phase) genes identified in the livers of rainbow trout, Oncorhynchus mykiss, by means of suppression subtractive hybridization
dbEST - database for "expressed sequence tags"
AVID: a global alignment program
Prediction of complete gene structures in human genomic DNA
Molecular characterization of a multidrug resistance-associated protein, Mrp2, from the little skate
Bile salt export pump is highly conserved during vertebrate evolution and its expression is inhibited by PFIC type II mutations
Molecular evolution of the HoxA cluster in the three major gnathostome lineages
The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins
cAMP-activated Cl⁻ channels in primary cultures of spiny dogfish (Squalus acanthias) rectal gland
Zebrafish: a model system for the study of human disease
Active conservation of noncoding sequences revealed by three-way species comparisons
Cellular and molecular biology of chloride secretion in the shark rectal gland: regulation by adenosine receptors
Cross-species sequence comparisons: a review of methods and available resources
VISTA: computational tools for comparative genomics
Molecular cloning of carp (Cyprinus carpio) C-type lectin and pentraxin by use of suppression subtractive hybridisation
Molecular cloning and characterization of rainbow trout (Oncorhynchus mykiss) CCAAT/enhancer binding protein beta
Primary structure and functional expression of a cDNA encoding the thiazide-sensitive, electroneutral sodium-chloride cotransporter
Experiments on isolated in vitro perfused rectal gland tubules of Squalus acanthias
Confocal microscopic observation of cytoskeletal reorganizations in cultured shark rectal gland cells following treatment with hypotonic shock and high external K+
Major reorganization of immunoglobulin VH segmental elements during vertebrate evolution
Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae
Characterisation of a fourth immunoglobulin light chain isotype in the common carp
The UCSC Genome Browser Database
Establishment of cell lines from multipotent epithelial sheet in the budding tunicate, Polyandrocarpa misakiensis
Intracellular accumulation of mercury enhances P450 CYP1A1 expression and Cl⁻ currents in cultured shark rectal gland cells
Integrating genomic homology into gene structure prediction
Tyrosine phosphorylation is a novel pathway for regulation of chloride secretion in shark rectal gland
Vasoactive intestinal peptide, forskolin, and genistein increase apical CFTR trafficking in the rectal gland of the spiny dogfish, Squalus acanthias. Acute regulation of CFTR trafficking in an intact epithelium
Identification of a coordinate regulator of interleukins 4, 13, and 5 by cross-species sequence comparisons
rVISTA 2.0: evolutionary analysis of transcription factor binding sites
rVista for comparative sequence-based discovery of functional transcription factor binding sites
Identification and localization of a dogfish homolog of human cystic fibrosis transmembrane conductance regulator
The Comparative Toxicogenomics Database (CTD). Environ
Promoting comparative molecular studies in environmental health research: an overview of the comparative toxicogenomics database (CTD). Pharmacogenom
Cell and molecular biology of marine elasmobranchs: Squalus acanthias and Raja erinacea
TRANSFAC: transcriptional regulation, from patterns to profiles
A medaka gene map: the trace of ancestral vertebrate proto-chromosomes revealed by comparative gene mapping
Mulan: multiple-sequence local alignment and visualization for studying function and evolution
zPicture: dynamic alignment and visualization tool for analyzing conservation profiles
Comparative genomic tools and databases: providing insights into the human genome
Benchmarking tools for the alignment of functional noncoding
Cell cultures from marine invertebrates: obstacles, new approaches and recent developments
Improving gene recognition accuracy by combining predictions from two gene-finding programs
An element in intron 1 of the CFTR gene augments intestinal expression in vivo
Evolutionary conservation of regulatory elements in vertebrate Hox gene clusters
Multi-PipMaker and supporting tools: alignments and analysis of multiple genomic DNA sequences
PipMaker - a web server for aligning two genomic DNA sequences
Mode of activation of salt secretion by C-type natriuretic peptide in the shark rectal gland
Function of alternative splicing
Parallel construction of orthologous sequence-ready clone contig maps in multiple species
Vertebrate genome sequencing: building a backbone for comparative genomics
Comparative analyses of multi-species sequences from targeted genomic regions
Characterization of immunoglobulin light chain isotypes in the common carp
Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach
Cl⁻ secretion by cultured shark rectal gland cells. I. Transepithelial transport
Genes regulated cooperatively by one or more transcription factors and their identification in whole eukaryotic genomes
Molecular and functional characterization of s-KCNQ1 potassium channel from rectal gland of Squalus acanthias
A microsatellite genetic linkage map for Xiphophorus
The role of bile salt export pump mutations in progressive familial intrahepatic cholestasis type II
Spidey: a tool for mRNA-to-genomic alignments
Molecular cloning and functional expression of the bumetanide-sensitive Na-K-Cl cotransporter
Cyclooxygenase cloning in dogfish shark, Squalus acanthias, and its role in rectal gland Cl secretion
Identification of protein coding regions in the human genome by quadratic discriminant analysis