Major Bugs/Errors are shown in red
Major Additions are shown in blue
Minor stuff or upgrades that won't likely have a big impact are shown in black
Next Version (soon)
HOMER v4.7 (8/25/14)
- Many many small changes and updates
- In some cases strand specific read
counting seemed to randomly switch to unstranded read
counting when using annotatePeaks.pl - fixed.
- Default behavior for
annotatePeaks.pl was to count reads based on the average
peak size - now the default is "-size given"
(recommended to always use -size parameter)
- Modification to background annotation and update scripts
that links gene symbols/IDs to more relevant
RefSeq/Ensembl IDs. Before they would often link to
XM_###### which are less useful.
- Incorporated several tools for mCpG analysis from
methylC-Seq or BS-Seq data - documentation to come soon.
- Fixed issues with GTF parsing. By default it will
output each transcript. However, when running most
programs or parseGTF.pl you can specify "-gid" to
have the program output representative transcripts using
the gene_id instead of the transcript_id.
- Changed default on -strand option for analyzeRepeats.pl
to 'both' because it is a safer default.
- Changed the way the accession number tables are created
in HOMER to prioritize the assignment of NM RefSeq numbers
as representative IDs for each gene instead of XM and NR
RefSeq numbers. This had caused some issues when
trying to match up data to promoters in the past - this
modification should dramatically improve results when
attempting to do similar analysis in the future.
This change applies to all of the 'organism' packages
- Updated all organism and UCSC genome packages
HOMER v4.6 (3/29/14)
- perl scripts will now run perl from the PATH
instead of /usr/bin/perl (modified shebangs)
- findPeaks now finds super enhancers with the "-style
super" option - works better now with large data
- getGWASOverlap.pl script has been fixed
(upgrades to mergePeaks broke it from before)
- findHiCDomains.pl has been improved and
- getPeakTags/findPeaks have been updated to work
faster and use less memory particularly for large data
HOMER v4.5 (1/27/14)
- Updated peak finding and read counting code
to be much more memory efficient (findPeaks, getPeakTags)
- findPeaks now finds super enhancers with the "-style
super" option (peak finding documentation is updated
- Using GTF files will now (by default) report the transcript_id
in the output file, not the gene_id (parseGTF.pl)
- Fixed bug so that annotatePeaks.pl will now
center peaks on motifs with the "-size given" option.
- Fixed edge effects with tag coverage and bedGraph/wig in
HOMER v4.5 (1/27/14)
- Fixed error in updateGeneIdentifiers.pl that
caused none of the key annotation files to be downloaded.
- Fixed bug in assignGenomeAnnotation introduced
in last version that provided incorrect annotation
priority assignment (i.e. TSS given priority over intron
annotations when overlapping, etc.)
- Modified updateUCSCGenomeAnnotation.pl slightly
to be smarter and more automated.
- Updated annotations in all genome packages
- Updated configureHomer.pl script to correctly
- Fixed website to provide access to old versions of the
- Fixed warnings during c++ compilation
HOMER v4.4 (1/14/14)
- Updated annotations
and system for data organization. Organism
accessions and GO are now managed separately from the
main code and promoters/genomes.
- Code for updating gene accessions, promoter locations,
genome annotations, etc. are now included in HOMER and
available in the homer/update/ directory.
- loadPromoters.pl and loadGenomes.pl
scripts now make it much easier to incorporate any
organism into HOMER.
- makeUCSCfile and annotatePeaks.pl
now normalize experiments to a fragment length of
100. Experiments with larger/smaller lengths are
normalized in bedGraphs and the 'Coverage' column of
- Changed defaults for mergePeaks to use the given
size of the peaks when merging ("-d given" is now
default, not 100 bp)
- Fixed rare bug with makeTagDirectory
that would cause some chromosomes to change the
strand of all reads on the chromosome (Not many
reports from users, but could have happened in the last
- Added option to findMotifsGenome.pl and preparseGenome.pl
such that the user can choose the directory to store
preparsed files. This is useful when a system has
many users and a single, shared installation of
HOMER. Also, by default, the command will set the
permissions on the preparsed directories to be group
- Most programs now include the command line options used
in the output for better record keeping.
- Fixed error in mergePeaks when merging peaks
from a single peak file - previously there was potential
for problems when a peak file was completely within
another one (only affected variable length peaks)
- Added option with annotatePeaks.pl to store
annotation enrichment results ("-annStats
- Made compareMotifs.pl parallel so that checking
known motifs for matches is much faster if running findMotifs.pl/findMotifsGenome.pl
with multiple CPUs.
- findGO.pl (which is run by findMotifs.pl)
will now check ontologies in parallel with multiple CPUs.
- Change in line-up for GO enrichment (incorporates NCBI's
biosystems database which includes KEGG, reactome, etc.)
- Added support for finding super enhancers (findPeaks
- parseGTF.pl now removes any accession number
versioning (i.e. NM_012345.2 -> NM_012345) from
- Fixed annotatePeaks.pl so that if a custom genome is
used with an unknown organism, it will still try to add
gene information from the "-gene <file>" option.
- Conservation options (for phastCons plots) are being
phased out. New instructions on how to analyze
conservation will appear soon.
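
The accession version stripping noted above for parseGTF.pl (NM_012345.2 -> NM_012345) can be sketched roughly as follows. This is only an illustration in Python, not HOMER's actual code: the helper name strip_accession_version is hypothetical, and it assumes the version is always a trailing "." followed by digits.

```python
import re

# Hypothetical helper (not part of HOMER): matches a trailing ".<digits>"
# version suffix, as in NM_012345.2.
_VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_accession_version(accession: str) -> str:
    """Return the accession with any trailing version suffix removed."""
    return _VERSION_SUFFIX.sub("", accession)


if __name__ == "__main__":
    for acc in ["NM_012345.2", "NM_012345", "XM_000001.10"]:
        print(acc, "->", strip_accession_version(acc))
```

Accessions without a version suffix pass through unchanged, so the function is safe to apply to every ID in a file.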
HOMER v4.3 (8/26/13)
- Added automation scripts (batchParallel.pl,
etc.) and documentation.
- Fixed issue with BED file processing with non-unique
IDs. (added -unique option to bed2pos.pl)
- analyzeRNA.pl now defaults to "-count genes" instead of
- findMotifsGenome.pl now requires that you
specify the -size parameter when running it.
- Removed duplicates in the known motif library
- Fixed error with pthread initialization that would cause
de novo motif finding to crash in rare circumstances (homer2,
- By default, hubs will not have a line drawn at zero
(looks a little more professional, makeMultiWigHub.pl)
- Added scrambleFasta.pl script, in case you don't
have a background file for motif finding. findMotifs.pl
no longer requires a background file, although
it's still highly recommended.
- Fixed bug in the reporting of FDR for de novo motif
finding (using -fdr <#> option).
Previously, the HTML output page would occasionally report
the FDR of motifs that were similar to the primary motif,
not the FDR calculation for the motif itself. The
"motif files" and "more information" page reported the
proper FDR before - now all report the correct value.
HOMER v4.2 (4/11/13)
- Fixed error in annotation that would lead
some peaks found right on the boundary of two different
annotations to be assigned the default (intergenic).
Fixed this and added more output statistics for annotation
regions, including the total amount of sequence assigned
to each annotation so that the expected annotation can be
calculated (annotatePeaks.pl, assignGenomeAnnotation)
- HOMER can now extract sequence information from a near
unlimited number of peak regions (previously it would slow
down if regions overlapped continuously across the
- Sequence extraction, QC, and peak finding routines have
had bugs fixed that arise when analyzing genomes with
thousands of scaffolds (makeTagDirectory, findPeaks,
- findPeaks fixed - fold threshold calculations relative
to input and local read density have been modified.
Previously, the minimum coverage in the background region
was set to the average genomic coverage. This has
been replaced with a pseudo count (0.5 reads) - this helps
with small genomes where the average genomic background
may be quite high. This change increases sensitivity
- fixed bug in motif comparison output after motif finding
- Many upgrades to analyzeRepeats.pl
- Gene Ontology result files for all ontologies now report
gene symbols instead of Entrez Gene IDs (easier for users
- Fixed problem with double-counting of isoforms for GO
analysis (findMotifs.pl, findGO.pl, annotatePeaks.pl)
- Arabidopsis annotation changes: Chromosomes now named
"1" instead of "Chr1" to be consistent with Ensembl
HOMER v4.1 (11/2/12)
- Made efficiency improvements to SIMA.pl
and runHiCpca.pl (correlation calculations) for
Hi-C. Added "-rawAndExpected <file>" output
option for analyzeHiC to allow simultaneous
reporting of raw and expected interactions at the same
- Enabled 2D histograms to work correctly for Hi-C
interaction data (analyzeHiC with the "-hist
<#>" option).
- Improved GC normalization options in
makeTagDirectory. New option "-iterNorm <#>" allows
for more precise normalization control
- fixed mergePeaks to allow merging of single peak
files that are strand specific.
- fixed "-gsize <#>" input parsing for findPeaks
(accepts scientific notation i.e. 2e9 now)
- fixed problem with findPeaks that would cause
peak coordinates to be negative or larger than the
chromosome when centering peaks.
- fixed file naming bug for QC files when parsing Hi-C
alignment files with makeTagDirectory
("LocalDistribution.txt" file was incorrectly named).
- fixed scaling bug for histograms in annotatePeaks.pl -
"Coverage" column may not be compatible with histograms
made with different bin sizes (fixed now)
HOMER v4.0 (10/15/12)
- Incorporated Hi-C routines into HOMER (several programs
including analyzeHiC, runHiCpca.pl,
- release of new documentation for Hi-C analysis,
including updates to other parts of the annotation
HOMER v3.18 (10/2/12)
- Added support to annotatePeaks.pl
to quantify WIG file coverage at peaks ("-wig <WIG
- Fixed SAM parsing for paired-end files
HOMER v3.17 (9/15/12)
- pre-release of Hi-C
- additional option for SAM/BAM parsing in makeTagDirectory
("-unique", "-keepOne", "-keepAll").
HOMER v3.16 (8/15/12)
- Updates to analyzeRepeats.pl
to enable additional control over classes of repeats
- Added support for mCpG files from ENCODE (makeTagDirectory
option "-format mCpGbed")
HOMER v3.15 (8/2/12)
- Added program analyzeRepeats.pl
to quantify reads in repeat regions (will likely replace
analyzeRNA.pl in the near future).
- configureHomer.pl
now has options -bigWigDir, -bigWigUrl, -hubsDir, -hubsUrl
to set values used in makeBigWig.pl and makeMultiWigHub.pl
and stores them in the config.txt file so they will be
constant with future updates.
HOMER v3.14 (7/20/12)
- Updated and added species-specific
motif libraries (vertebrates, insects, worms,
yeast, plants, all). findMotifs.pl and findMotifsGenome.pl
will try to auto-detect the organism based on promoter
set/genome. Can be overridden with "-mset setName".
- Modernized the analyzeChIP-Seq.pl script to handle
the major options for findPeaks/findMotifsGenome.pl/annotatePeaks.pl
- makeUCSCfile can now accept input experiments
("-i inputDirectory") to normalize the bedGraph files. To
avoid low coverage artifacts, pseudo counts are added when
performing the ratio calculation ("-pseudo #", default:
5). Can also report the log2 ratio ("-log").
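
As a rough illustration of the input normalization described above for makeUCSCfile, here is a minimal Python sketch. It is not HOMER's code: the function name normalized_ratio is hypothetical, and it assumes the pseudo count is simply added to both the experiment and input coverage before dividing, which the notes above do not spell out.

```python
import math


def normalized_ratio(experiment: float, control: float,
                     pseudo: float = 5.0, log2: bool = False) -> float:
    """Experiment/input coverage ratio with a pseudo count added to both.

    Assumption: the pseudo count is added to numerator and denominator.
    The default of 5 mirrors the "-pseudo #" default mentioned above.
    """
    ratio = (experiment + pseudo) / (control + pseudo)
    # Optionally report the log2 ratio, analogous to the "-log" option.
    return math.log2(ratio) if log2 else ratio


if __name__ == "__main__":
    print(normalized_ratio(15, 5))             # plain ratio
    print(normalized_ratio(15, 5, log2=True))  # log2 ratio
    print(normalized_ratio(1, 0))              # low coverage stays near 1
```

The pseudo count keeps positions with almost no reads in either track close to a ratio of 1 instead of producing huge or undefined values.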
HOMER v3.13.1 (7/18/12)
- Added routines for mC
analysis. makeTagDirectory can process ENCODE-style
methylation files ("-format mCpGbed") into mC tag
directories. These can be used with annotatePeaks.pl
to calculate methylation profiles and avg. methylation
content at peaks (when running annotatePeaks.pl use the
- Fixed bug in analyzeRNA.pl that would leave off
counting the first nucleotide of the gene
- Updated motif library, other small things
HOMER v3.13 (6/22/12)
- Update to annotation
system, better annotation for ncRNA, more accurate UTR
boundaries (some were off by a bp or two
- No more separate "masked" genomes - homerTools
extract now has option "-mask" that will replace
soft masked sequences (e.g. lowercase letters) with N.
Programs like findMotifsGenome.pl and annotatePeaks.pl
now have option "-mask" or will interpret hg18r as
shorthand for "hg18 ... -mask"
- In makeTagDirectory, fixed errors with CIGAR
parsing of SAM files
- Improved RNA-Seq bedGraph
visualization at splice junctions - use "-fragLength given"
with makeUCSCfile or makeBigWig.pl etc.
- Fixed a bunch of other stuff I can't remember...
HOMER v3.12 (6/8/12)
- Bugs fixed and small
- Fixed problem with mergePeaks crashing
HOMER v3.11 (5/21/12)
- Lots of bugs fixed and
small options added.
- Fixed inconsistencies with treating BED files as
maintains the peak order when making heatmaps
- Fixed IP efficiency calculation for "-style histone" or
"-region" in findPeaks
- Fixed bug with multi-processor support on some Linux
distros (shouldn't crash anymore) with motif finding
- Fixed bug in getDifferentialPeaks
with -size parameter
HOMER v3.10 (3/22/12)
- Lots of bugs fixed and
small options added.
now works with bedGraph files in a manner similar to tag
directories (option "-bedGraph <file>").
- Added "-precision <1|2|3>" option to makeTagDirectory to
print values in format 1.0 or 1.00 or 1.000 (useful if
normalizing or using fractional tag counts)
- Fixed annotatePeaks.pl
"-center <motif>" option when using unbalanced peak
size (i.e. "-size -200,50").
- Added program removeOutOfBoundsReads.pl
to remove reads that are out of bounds, causing problems
for UCSC (some alignment programs have a tendency to do
- several new options added to compareMotifs.pl (scale heights of logos
to information content "-bits", skip similar
matching/visualization "-basic", etc.)
HOMER v3.9 (2/1/12)
- Fixed bigWig/hub creation
to work with updates at UCSC (makeBigWig.pl/makeMultiWigHub.pl). makeBigWig.pl now
requires that you enter the genome as an argument, and
when making bigWigs with makeUCSCfile,
you need to specify the chrom.sizes file (makeBigWig.pl
and makeMultiWigHub.pl take care of this automatically)
can now process bedGraph files just like a tag directory
by using the "-bedGraph" option (i.e. make histograms,
calculate read density, etc.)
HOMER v3.8.2 (1/6/12)
- Fixed issue with findPeaks using too
much memory. Added option "-minTagThreshold
<#>" that controls the smallest peaks to
consider. By default this is set at the uniform
density (i.e. expected tags per peak region given a
uniform tag coverage)
- Fixed bug with findMotifsGenome.pl/findMotifs.pl/homer2
"-cache <#>" that caused a crash if too large of a
cache was specified.
- Fixed bug with 5' adapter trimming and added the option
to trim adapter sequence while allowing mismatches with homerTools.
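
The "uniform density" default mentioned above for "-minTagThreshold" (expected tags per peak region given uniform tag coverage) comes down to simple arithmetic. A minimal Python sketch follows. The function name and the example numbers are made up for illustration, and the actual calculation inside findPeaks may include details not described here.

```python
def expected_tags_per_peak(total_tags: float, peak_size: int,
                           genome_size: float) -> float:
    """Expected tag count in one peak-sized window under uniform coverage.

    Assumption: tags are spread evenly over the genome, so a window gets
    a share of the tags proportional to its size.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_tags * peak_size / genome_size


if __name__ == "__main__":
    # Illustrative values only: 10 million tags, 200 bp peaks, 2e9 bp genome.
    print(expected_tags_per_peak(10_000_000, 200, 2e9))
```

Windows with fewer tags than this value are no denser than uniform background, which is presumably why they can be skipped to save memory.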
HOMER v3.8.1 (11/30/11)
- (3.8.1) Fixed sequence
parsing issues for short sequences for de novo
finding, added support for % based histograms over
variable length regions (i.e. gene bodies)
- Added support for UCSC Hub
- Modified general routines to work better with large
numbers of chromosomes (i.e. genomes composed of scaffolds
like X. tropicalis)
- Added support for Arabidopsis (tair10) and
X. tropicalis (xenTro2)
- Updated annotations
- Fixed bug with CpG/GC% calculations in annotatePeaks.pl
dealing with variable
- "-forceBED" option is now standard for parsing BED files
of sequence read alignments (makeTagDirectory), new option
"-force5th" to use 5th column of BED file as read count
- Fixed issue with auto-detecting BAM files
HOMER v3.7 (11/02/11)
- Added q-value/FDR
calculations to de novo motif finding.
Unfortunately, due to the complexity behind the de novo algorithm,
the only way to do this is to calculate it empirically by
randomizing the data and recalculating motifs. As a
result, it takes a long time to calculate FDR values
(option -fdr <#>, # is the number of randomizations,
- Fixed bug in findMotifsGenome.pl causing the option
"-size given" to crash.
- Fixed calculation of peak overlap significance in mergePeaks, in cases
where "-d #" is used.
HOMER v3.6 (10/12/11)
- Fixed bugs in mergePeaks, v3.5 would
crash in some extreme cases. Fixed significance
calculations for peak overlaps (again) to deal with nearby
peaks from the same file.
- Changed the way "-matrix <filename>" works with mergePeaks/annotatePeaks.pl.
- Added q-values (Benjamini multiple hypothesis testing
corrections) to known motif finding (findMotifs.pl/findMotifsGenome.pl)
- fixed bug in findMotifs.pl that causes problems with
custom promoter sets and failure to output de novo motifs
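
For readers unfamiliar with the q-values mentioned above for known motif finding, here is a minimal Python sketch of the Benjamini-Hochberg procedure. It assumes that "Benjamini" refers to Benjamini-Hochberg. HOMER's own implementation is not shown in these notes and may differ in detail, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the same order as the input."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is the minimum
    # of p * n / rank over this rank and all larger ranks (monotonicity).
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues


if __name__ == "__main__":
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

Each q-value estimates the false discovery rate incurred if that motif and all motifs with smaller p-values are called significant.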
HOMER v3.5 (10/06/11)
- Changed how findPeaks
interprets genome size: When using "-gsize <#>",
use the number of mappable bases, not 2x the number as
- Added "-nfr" flag to findPeaks
to help find nucleosome free regions in histone
modification data (works best with MNase datasets)
- Fixed bug in mergePeaks - when merging peaks where several
peaks in the same file are within range (i.e. merging
transcription factor peaks within 100000 bp), the latest
version would sometimes crash.
- mergePeaks now
outputs the total number of peaks that contribute to each
peak in the 8th column.
- fixed bug with findPeaks - normally findPeaks uses the
average coverage of an experiment as the minimum when
considering the enrichment over input signal - this value
was divided by 2 in the previous versions. Weak ChIP-seq
experiments will likely see fewer peaks in the output file
now when using input to filter the experiment (but the
peaks you do get back will be better)
HOMER v3.4 (09/30/11)
- Fixed problem with motif statistic
reporting for de novo motifs (findMotifsGenome.pl, findMotifs.pl, homer2). Previous
version calculated motif percentages using sequence that
had more significant motifs masked, causing some instances
to be missed. HOMER now reports the % of sequences
containing the motif using the original
sequences. The motifs found by the algorithm
themselves are unaffected, just the reported statistics
have changed. This change only affects the statistics
found in the homerResults.html file. Other tasks,
such as searching for motifs, are unaffected.
- Fixed GTF format parsing when the file is not sorted
properly (caused some annotation to be dropped from
consideration if the file wasn't presorted).
- Chuck facts must now be installed separately using the configureHomer.pl
HOMER v3.3 (09/28/11)
- makeTagDirectory will
now take gzipped (*.gz), zipped (*.zip), bzip2 (*.bz2), and
bam (*.bam) files directly - no need to decompress
them. samtools needs to be installed and available
on the executable path for homer to work with *.bam files
- Fixed total mapped tag normalization when running annotatePeaks.pl or analyzeRNA.pl with
options like "-pc <#>" that limit the number of
reads considered at a specific position. These
programs now reference the "tagCountDistribution.txt" file
in the tag directory to properly scale the total number of
tags to be compatible with the limiting function.
- fixed problem with mergePeaks - peaks from the same file
within the distance ("-d #") are also merged.
- mergePeaks now
outputs how many peaks were merged in each output peak
(useful when merging over large distances)
- mergePeaks can now work with a single peak file to merge
peaks found within a given distance of one another.
Can also be used to filter a peak file for peaks found
within a given region.
- Slight modifications made to the calculations for peak
overlap significance in mergePeaks
(When using "-matrix ..." option). Total coverage of
the peaks is now calculated to adjust numbers when peaks
are overlapping (not done before, might have been a
problem for a peak file with many overlapping peaks)
- Option for excluding Chuck Facts added ("-nofacts") to findMotifs.pl, findMotifsGenome.pl,
To permanently remove them, remove the file in
which is used by the de
novo motif finding programs to determine the
similarity between de
novo and known motifs, has been modified such
that Pearson's correlation is the default for comparing
motif matrices. When comparing matrices of different
length they are elongated with 0.25 frequencies.
Overall an improvement based on what a human would
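
The Pearson-based motif comparison described above (with shorter matrices elongated using 0.25 frequencies) can be illustrated with a small Python sketch. This is an assumption-heavy simplification and not HOMER's code: it pads only on the right and compares one fixed alignment, while a real comparison would likely also try different offsets and the reverse complement. The function names are hypothetical.

```python
import math

# One padding column: equal frequency for A, C, G, T.
UNIFORM = [0.25, 0.25, 0.25, 0.25]


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is flat)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def motif_similarity(a, b):
    """Correlate two motif matrices (lists of 4-frequency columns).

    Assumption: the shorter motif is padded on the right with uniform
    columns so both matrices have the same length.
    """
    length = max(len(a), len(b))
    padded_a = a + [UNIFORM] * (length - len(a))
    padded_b = b + [UNIFORM] * (length - len(b))
    flat_a = [f for col in padded_a for f in col]
    flat_b = [f for col in padded_b for f in col]
    return pearson(flat_a, flat_b)


if __name__ == "__main__":
    m1 = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1]]
    m2 = [[0.1, 0.1, 0.1, 0.7], [0.1, 0.1, 0.7, 0.1]]
    print(motif_similarity(m1, m1), motif_similarity(m1, m2))
```

Padding with 0.25 means the extra columns carry no base preference, so they neither reward nor strongly penalize the length difference.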
HOMER v3.2 (08/11/11)
- Fixed problem with peak annotation (if peaks
are overlapping some would not be annotated)
- Lots of small things, such as file format detection,
HOMER v3.1 (05/25/11)
- Added "easy" Custom Genome support.
Instead of specifying a "genome" such as "hg18" for
programs such as findMotifsGenome.pl or makeTagDirectory,
you can specify the path to the genomic FASTA files
(either a single file or a directory with FASTA files
named for each chromosome).
- No longer need a "reference file" for preparsing the
genome, now it will just randomly determine regions if one
is not provided or cannot be found (i.e. if you have a
option will automatically split up large background
regions to the size of the target regions. So if
you're too lazy to explicitly select regions as
background, you can provide a large FASTA file (with findMotifs.pl
mode) or a large region (i.e. a whole chromosome in findMotifsGenome.pl
and the "-chopify" option will tell HOMER to chop up the
region into smaller, target-sized, chunks.
for motif finding to output RNA style motifs and
automatically searches only the + strand.
- Fixed SAM format auto detection. Works better now
(If you have BAM, use samtools
to convert BAM formatted files to SAM)
- Fixed problem with automatic genome size detection with
smaller genomes in findPeaks
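
The idea behind the "-chopify" option described above (chopping a large background region into target-sized chunks) can be sketched in a few lines of Python. This is an illustration only, not HOMER's implementation: the function name chop_region is hypothetical, coordinates are treated as half-open, and dropping the leftover piece at the end is an assumption.

```python
def chop_region(start: int, end: int, chunk_size: int):
    """Split [start, end) into consecutive chunks of exactly chunk_size bp.

    Assumption: a trailing piece shorter than chunk_size is discarded so
    every background chunk matches the target region size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    pos = start
    while pos + chunk_size <= end:
        chunks.append((pos, pos + chunk_size))
        pos += chunk_size
    return chunks


if __name__ == "__main__":
    # Illustrative values: a 1000 bp region chopped into 200 bp chunks.
    for chunk in chop_region(0, 1000, 200):
        print(chunk)
```

Keeping background chunks the same size as the target regions avoids comparing sequences of very different lengths during motif enrichment.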
HOMER v3.0 (05/09/11)
- New motif finding program homer2
masking strategy increases sensitivity for finding
- Added zebrafish
(danRer7) and yeast (sacCer2) support.
- Added GRO-Seq
analysis routines (findPeaks, analyzeRNA.pl)
- Added read
normalization and bundled QC into makeTagDirectory
Autonormalization helps reduce problems caused by
sequence composition bias
- Bunch of other stuff...
HOMER v2.7 (12/14/10)
- added support for parsing
alignment in SAM format. (If you have BAM, use samtools
to convert BAM formatted files to SAM)
HOMER v2.6 (10/21/10)
program for making histograms across multiple peak files.
- findPeaks now has "-region" mode for identifying
variable-length regions of signal enrichment.
- tagDir2bed.pl script available to easily export tag data
into bed file format
- findMotifsGenome.pl and annotatePeaks.pl are now BED
file compliant! Since everyone uses those, I guess Chuck
can too. But he still prefers position/peak
files! Not all subprograms work with BED files, so
you may still need to use bed2pos.pl to switch formats to
do certain tasks.
HOMER v2.5 (10/11/10)
- Fixed errors in mergePeaks script (gave
negative chromosome coordinates before if compiled on
- No longer keep around temporary/sequence files from the
- Added wikipathways to GO analysis
- Fixed bug in clonal filtering with peak finding (affects
highly clonal experiments/lower organism analysis)
- getDistalPeaks.pl program now finds intra/inter-genic
- updated promoter sets to use refseq
- New error checking for FASTA file input (previously
HOMER required only ACGTN characters).
HOMER v2.4 (08/30/10)
- New annotation system
(2 tier -
promoter/exon/intron/intergenic and detailed i.e. repeats,
etc.). Also, fixed a slight bug in annotation
- Genomic Gene annotations standardized on refGene.txt
file from UCSC. (includes miRNA and some other non-coding
- New GenomeOntology.pl program
significance association calculations with genomic
annotations like repeats, other peak files, etc.
- Annotations for
exons/introns/promoters etc., repeats, gene deserts/gene
rich regions, GO terms, peaks from published data.
- Works with both peaks and with tag directories (for
peak independent analysis)
- Added as an
- New mergePeaks
- rewritten in C++ with added
- Added strand specific tag counting to annotatePeaks.pl
- Added analyzeRNA.pl
program to compute gene expression levels from RNA-Seq
Will also work for repeats, but needs
a lot of memory.
- Added tts mode (to go along with tss mode) in
- Added "motifFindingParameters.txt" file that remembers
which parameters were used during motif finding
- Added "bias motifs" to known libraries to help users
identify motifs that likely come from sequence bias or
are just garbage.
- Added strand
support to makeUCSCfile for UCSC genome browser tag
pile-ups - fixed
problem with tag extensions exceeding the size of the
- Added MSigDB links
and annotations to GO analysis.
- Updated all
annotations, accessions as of 8/30/2010.
- Fixed error in histogram creation
(stupid round-off error shifted the location of some
- assignGenomeAnnotation program rewritten in C++ for a
dramatic speed up.
- Fixed problem of findMotifsGenome.pl crashing if
non-unique peak ids are used
- Fixed redundant ID creation when generating background
sequences (minor issue)
- Added peak file checking (i.e. for non-redundant IDs)
- Added more known motifs
- Bunch of other minor stuff I can't remember...