- May 2019
Gene set enrichment analysis is a leading computational method for placing newly acquired high content data in the context of prior biological knowledge (1). Gene set enrichment analysis tools such as DAVID (2), GenePattern (3), WebGestalt (4), AmiGO (5), Babelomics (6), GeneVestigator (7), GOEAST (8), Panther (9) and Enrichr (10,11) have been widely used, demonstrating the utility and relevance of this approach for many diverse studies.
Is GSEA really the leading method now? Not NEA?
- Mar 2019
- Jan 2019
Please check out Software Carpentry as well. This is a great intro that covers not just programming and data analysis (R/Python), but also a lot of crucial material that every bioinformatician should know but that is usually not covered in courses, such as:
- Unix shell
- Git and version control
- Unit testing
- SQL and databases
- Data management and provenance
I also like A Quick Guide to Organizing A Computational Biology Project for organizational techniques that usually have to be learned by experience.
In my opinion, if you can enroll in a degree program for systems biology, that would be best. However, if you are just exploring the field on your own, I would recommend going through these resources:
- Video lectures by Uri Alon
- Video lectures by Jeff Gore
- Principles of Synthetic Biology (at edX)
- Coursera specialization on systems biology
If you are looking for a mathematically intensive start, begin with the first two; if you are looking for a biologically intensive one, begin with the last two. Either way, go through all four, as they provide diverse perspectives on systems biology, which is very important. As you move through these materials, they will point you to all the necessary supplementary information, such as books, papers and software. Hope this helps!
Apart from the free and open source KNIME Analytics Platform, KNIME also has commercial offerings. The KNIME server provides a platform for sharing workflows. It has a web interface and is connected to a KNIME instance for executing workflows remotely on demand or according to a schedule. Also commercially available are the Big Data Extensions and the KNIME Spark executor.
- Apr 2017
RepeatMasker was developed using TRF version 4.0.4
Downloaded v4.0.9, Linux command line (legacy GLIBC, <= 2.12)
For RMBlast (NCBI BLAST modified for use with RepeatMasker/RepeatModeler)
Used RMBlast pre-compiled binaries provided by NCBI.
Previous Release: 2.2.28
Download Pre-compiled Package: Download both the BLAST+ and RMBlast packages from NCBI for your platform:
Extract both tarballs, then symlink or copy the RMBlast rmblastn binary into the blast/bin/ location, so that all of the binaries are in one place.
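The symlink step above can be sketched in Python. This is only an illustration under assumptions: the function name `link_rmblastn` and the example directory paths are hypothetical, and the real paths depend on where the two tarballs were extracted.

```python
from pathlib import Path


def link_rmblastn(rmblast_bin, blast_bin, name="rmblastn"):
    """Symlink the rmblastn binary into the BLAST+ bin directory.

    rmblast_bin and blast_bin are the bin/ directories of the extracted
    RMBlast and BLAST+ packages (paths are supplied by the caller).
    """
    src = Path(rmblast_bin) / name
    dest = Path(blast_bin) / name
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found; check the extracted RMBlast path")
    if dest.exists() or dest.is_symlink():
        return dest  # something is already there, leave it untouched
    dest.symlink_to(src.resolve())
    return dest


# Example call with hypothetical extraction paths:
# link_rmblastn("/opt/rmblast/bin", "/opt/blast/bin")
```

Copying the file instead of symlinking would work equally well; the symlink just avoids keeping two copies of the binary.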
I'm the developer of pyGeno. Here's a little script that does just that for the gene TPST2, using segment trees.
recipe for merging transcripts of a gene into a single compound transcript
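The original pyGeno script is not reproduced here. As a rough illustration of the idea only, the sketch below merges exon intervals from several transcripts into one compound transcript. It does not use pyGeno's API, and it swaps the segment tree for a plain sort-and-sweep interval merge; the function name `merge_transcripts` and the coordinates are made up for the example and are not TPST2 data.

```python
def merge_transcripts(transcripts):
    """Merge exon intervals from several transcripts into one compound transcript.

    transcripts: iterable of lists of (start, end) exon intervals.
    Returns a sorted list of non-overlapping (start, end) intervals.
    """
    # Pool all exons and sort them by start coordinate.
    exons = sorted(exon for transcript in transcripts for exon in transcript)
    merged = []
    for start, end in exons:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Toy coordinates, purely illustrative.
toy_transcripts = [
    [(100, 200), (300, 400)],
    [(150, 250), (500, 600)],
    [(390, 450)],
]
print(merge_transcripts(toy_transcripts))
# -> [(100, 250), (300, 450), (500, 600)]
```

A segment tree becomes worthwhile when the same intervals are queried many times; for a one-off merge of a single gene's exons, sorting is usually enough.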
- Jan 2014
Additional broader impacts will emerge from analyses of the whooping crane dataset. Through collaborations with endangered species biologists in the US Geological Survey, these analyses will have direct relevance to specific management actions for the whooping crane, such as the timing, group size, and composition of crane reintroductions and potentially their training with ultra-light aircraft.
Broader impact for management of endangered species
The project will develop an analysis package in the open-source language R and complement it with a step-by-step hands-on manual to make tools available to a broad, international user community that includes academics, scientists working for governments and non-governmental organizations, and professionals directly engaged in conservation practice and land management. The software package will be made publicly available under http://www.clfs.umd.edu/biology/faganlab/movement/.
Output of the project:
- analysis package written in R
- step-by-step hands-on manual
- make tools available to a broad, international community
- software made publicly available
Question: What software license will be used? The Apache software license is potentially a good choice here: it is a strong open source license, it is supported by a wide range of communities, and it imposes few obligations or barriers to access and use, which supports the goal of reaching a broad international audience.
Question: Will the data be made available under a license, as well? Maybe a CC license of some sort?
These species represent not only different types of movement (on land, in air, in water) but also different types of relocation data (from visual observations of individually marked animals to GPS relocations to relocations obtained from networked sensor arrays).
Types of relocation data:
- visual observations
- GPS relocations
- networked sensor arrays
For example, by statistically analyzing the interrelationships of relocation data among individuals, it will be possible to distinguish and quantify population-level movement patterns such as migration, range residency, and nomadism.
Quantifying movement patterns at the population level:
- migration
- range residency
- nomadism
Are there examples of this kind of data product at scale?
This project will develop new and innovative data management and analysis tools focusing on the interrelationship of multiple moving individuals. These include measures that calculate 1) realized mobility (quantifying the relationship of individual to population ranges), 2) population dispersion (quantifying the spatial relationship among individuals), 3) movement coordination (quantifying the coordination of movements among individuals), and 4) intra-individual concordance (quantifying the spatial relationship of relocations of individuals over time). These innovative ways of treating animal movement data will allow researchers to investigate a broad range of new research questions.
1) Realized mobility: Relationship of individual to population ranges. 2) Population dispersion: Spatial relationship among individuals. 3) Movement coordination: Coordination of movements among individuals. 4) Intra-individual concordance: Spatial relationship of relocations of individuals over time.
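The proposal does not define these measures mathematically. As one hypothetical way to make the second measure concrete, the sketch below treats population dispersion as the mean pairwise distance among individuals at a single time step. The function name, the input format, and the choice of Euclidean distance on planar coordinates are all assumptions made for illustration, not the project's method.

```python
from itertools import combinations
from math import hypot


def mean_pairwise_distance(positions):
    """Mean Euclidean distance over all pairs of individuals.

    positions: dict mapping an individual's id to its (x, y) location
    at one time step, in planar coordinates (assumed, e.g. metres).
    """
    pairs = list(combinations(positions.values(), 2))
    if not pairs:
        raise ValueError("need at least two individuals")
    total = sum(hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in pairs)
    return total / len(pairs)


# Toy relocations for three individuals, purely illustrative.
toy = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (6.0, 8.0)}
dispersion = mean_pairwise_distance(toy)
```

Tracking this value over successive time steps would show whether individuals are converging or spreading out, which is the kind of population-level signal the note describes.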
but scientists' understanding of the emergent spatial dynamics at the population level has not kept pace, in large part due to an absence of appropriate tools for data handling and statistical analysis.
Tools gap needs to be filled to improve understanding of emergent spatial dynamics.
A grant is awarded to University of Maryland, College Park to develop informatics tools that allow scientists and conservation managers to use animal relocation and tracking data to study movement processes at the population level.
Movement Dynamics Homepage: http://www.clfs.umd.edu/biology/faganlab/movement/
- movement types
- NSF grant
- movement dynamics
- spatial ecology
- movement patterns
- relocation data sources
News: Thomas Mueller and Bill Fagan receive a new NSF Bioinformatics grant. Collaborators: Peter Leimgruber (Smithsonian Institution); Volker Grimm (Centre for Environmental Research - UFZ, Leipzig); Kirk A. Olson (University of Massachusetts); Todd K. Fuller (University of Massachusetts); George B. Schaller (Wildlife Conservation Society); Nuria Selva (Institute of Nature Conservation, Krakow).
- Peter Leimgruber, Smithsonian Institution
- Volker Grimm, Centre for Environmental Research - UFZ, Leipzig
- Kirk A. Olson, University of Massachusetts
- Todd K. Fuller, University of Massachusetts
- George B. Schaller, Wildlife Conservation Society
- Nuria Selva, Institute of Nature Conservation, Krakow