net.sf.picard.util
Class IntervalList

java.lang.Object
  extended by net.sf.picard.util.IntervalList
All Implemented Interfaces:
Iterable<Interval>

public class IntervalList
extends Object
implements Iterable<Interval>

Represents a list of intervals against a reference sequence that can be written to and read from a file. The file format is relatively simple and loosely follows the SAM alignment format. A SAM-style header must be present in the file; it lists the sequence records against which the intervals are described. After the header, the file contains one record per line in text format, with the following tab-separated values: Sequence name, Start position (1-based), End position (1-based, end inclusive), Strand (either + or -), Interval name (a name for the interval, ideally unique).
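
As an illustration of the record layout described above, the following minimal sketch splits one record line into its five fields. IntervalRecordSketch and its members are hypothetical names used only for this example; this is not Picard's Interval class or the parser that IntervalList uses, and header parsing is left out.

```java
/** Hypothetical holder for one parsed record; this is not Picard's Interval class. */
final class IntervalRecordSketch {
    final String sequence;
    final int start;              // 1-based
    final int end;                // 1-based, inclusive
    final boolean negativeStrand;
    final String name;

    IntervalRecordSketch(String sequence, int start, int end, boolean negativeStrand, String name) {
        this.sequence = sequence;
        this.start = start;
        this.end = end;
        this.negativeStrand = negativeStrand;
        this.name = name;
    }

    /** Splits one tab-separated record line into its five fields. */
    static IntervalRecordSketch parse(String line) {
        String[] fields = line.split("\t", -1);
        if (fields.length != 5) {
            throw new IllegalArgumentException("Expected 5 tab-separated fields, found " + fields.length);
        }
        String strand = fields[3];
        if (!strand.equals("+") && !strand.equals("-")) {
            throw new IllegalArgumentException("Strand must be + or -, found " + strand);
        }
        return new IntervalRecordSketch(fields[0], Integer.parseInt(fields[1]),
                Integer.parseInt(fields[2]), strand.equals("-"), fields[4]);
    }

    /** Both coordinates are inclusive, so the length is end - start + 1. */
    int length() {
        return end - start + 1;
    }

    public static void main(String[] args) {
        IntervalRecordSketch r = parse("chr1\t100\t200\t+\ttarget_1");
        System.out.println(r.sequence + ":" + r.start + "-" + r.end + " length=" + r.length());
    }
}
```

Because both coordinates are 1-based and inclusive, the record chr1 100 200 covers 101 bases, not 100.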

Author:
Tim Fennell, Yossi Farjoun

Constructor Summary
IntervalList(SAMFileHeader header)
          Constructs a new interval list using the supplied header information.
 
Method Summary
 void add(Interval interval)
          Adds an interval to the list of intervals.
 void addall(Collection<Interval> intervals)
          Adds a Collection of intervals to the list of intervals.
static IntervalList concatenate(Collection<IntervalList> lists)
          A utility function for merging a list of IntervalLists, checks for equal dictionaries.
static IntervalList copyOf(IntervalList list)
          Creates an independent copy of the given IntervalList.
static IntervalList difference(Collection<IntervalList> lists1, Collection<IntervalList> lists2)
          A utility function for finding the difference between two IntervalLists.
static IntervalList fromFile(File file)
          Parses an interval list from a file.
static IntervalList fromReader(BufferedReader in)
          Parses an interval list from a reader in a stream based fashion.
static IntervalList fromVcf(File file)
          Parses a VCF file and converts it to an IntervalList. The name field of each Interval is taken from the ID field of the variant, if it exists.
static IntervalList fromVcf(VCFFileReader vcf)
          Converts a vcf to an IntervalList.
 long getBaseCount()
          Gets the (potentially redundant) sum of the length of the intervals in the list.
 SAMFileHeader getHeader()
          Gets the header (if there is one) for the interval list.
 List<Interval> getIntervals()
          Gets the set of intervals as held internally.
 long getUniqueBaseCount()
          Gets the count of unique bases represented by the intervals in the list.
 List<Interval> getUniqueIntervals()
          Deprecated. 
 List<Interval> getUniqueIntervals(boolean concatenateNames)
          Deprecated. 
static List<Interval> getUniqueIntervals(IntervalList list, boolean concatenateNames)
          Merges list of intervals and reduces them like net.sf.picard.util.IntervalList#getUniqueIntervals()
static IntervalList intersection(Collection<IntervalList> lists)
          A utility function for intersecting a list of IntervalLists, checks for equal dictionaries.
static IntervalList intersection(IntervalList list1, IntervalList list2)
          A utility function for generating the intersection of two IntervalLists, checks for equal dictionaries.
static IntervalList invert(IntervalList list)
          Inverts an IntervalList and returns one that contains exactly the bases in the dictionary that the original does not.
 Iterator<Interval> iterator()
          Returns an iterator over the intervals.
 int size()
          Returns the count of intervals in the list.
 void sort()
          Deprecated. 
 IntervalList sorted()
          Returns an independent, sorted IntervalList.
static IntervalList subtract(Collection<IntervalList> listsToSubtractFrom, Collection<IntervalList> listsToSubtract)
          A utility function for subtracting a collection of IntervalLists from another.
static IntervalList union(Collection<IntervalList> lists)
          A utility function for finding the union of a list of IntervalLists, checks for equal dictionaries.
static IntervalList union(IntervalList list1, IntervalList list2)
           
 void unique()
          Deprecated. 
 void unique(boolean concatenateNames)
          Deprecated. 
 IntervalList uniqued()
          Returns an independent IntervalList that is sorted and uniquified.
 IntervalList uniqued(boolean concatenateNames)
          Returns an independent IntervalList that is sorted and uniquified.
 void write(File file)
          Writes out the list of intervals to the supplied file.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

IntervalList

public IntervalList(SAMFileHeader header)
Constructs a new interval list using the supplied header information.

Method Detail

getHeader

public SAMFileHeader getHeader()
Gets the header (if there is one) for the interval list.


iterator

public Iterator<Interval> iterator()
Returns an iterator over the intervals.

Specified by:
iterator in interface Iterable<Interval>

add

public void add(Interval interval)
Adds an interval to the list of intervals.


addall

public void addall(Collection<Interval> intervals)
Adds a Collection of intervals to the list of intervals.


sort

@Deprecated
public void sort()
Deprecated. 

Sorts the internal collection of intervals by coordinate.


sorted

public IntervalList sorted()
Returns an independent, sorted IntervalList.


uniqued

public IntervalList uniqued()
Returns an independent IntervalList that is sorted and uniquified.


uniqued

public IntervalList uniqued(boolean concatenateNames)
Returns an independent IntervalList that is sorted and uniquified.

Parameters:
concatenateNames - If false, interval names are not concatenated when intervals are merged, which saves space.

unique

@Deprecated
public void unique()
Deprecated. 

Sorts and uniques the list of intervals held within this interval list.


unique

@Deprecated
public void unique(boolean concatenateNames)
Deprecated. 

Sorts and uniques the list of intervals held within this interval list.

Parameters:
concatenateNames - If false, interval names are not concatenated when intervals are merged, which saves space.

getIntervals

public List<Interval> getIntervals()
Gets the set of intervals as held internally.


getUniqueIntervals

@Deprecated
public List<Interval> getUniqueIntervals()
Deprecated. 

Merges the list of intervals and then reduces them down where regions overlap or are directly adjacent to one another. During this process the "merged" interval will retain the strand and name of the 5' most interval merged. Note: has the side-effect of sorting the stored intervals in coordinate order if not already sorted.

Returns:
the set of unique intervals condensed from the contained intervals
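
The merge described above can be sketched for intervals on a single sequence as follows. MergeSketch and Span are hypothetical names introduced for this example, strand is left out, and the sketch sorts a copy rather than the stored list; it illustrates the overlap-or-adjacent rule and is not Picard's implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: merges overlapping or directly adjacent spans on one sequence. */
final class MergeSketch {
    static final class Span {
        final int start;   // 1-based
        final int end;     // 1-based, inclusive
        final String name;

        Span(int start, int end, String name) {
            this.start = start;
            this.end = end;
            this.name = name;
        }
    }

    static List<Span> merge(List<Span> input) {
        // Sort a copy by coordinate so the caller's list is left untouched.
        List<Span> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparingInt((Span s) -> s.start).thenComparingInt(s -> s.end));

        List<Span> out = new ArrayList<>();
        Span current = null;
        for (Span s : sorted) {
            if (current == null) {
                current = s;
            } else if (s.start <= current.end + 1) {
                // Overlapping or adjacent: extend, keeping the name of the earliest span.
                current = new Span(current.start, Math.max(current.end, s.end), current.name);
            } else {
                out.add(current);
                current = s;
            }
        }
        if (current != null) {
            out.add(current);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Span> spans = new ArrayList<>();
        spans.add(new Span(15, 30, "c"));
        spans.add(new Span(1, 10, "a"));
        spans.add(new Span(11, 20, "b"));
        spans.add(new Span(40, 50, "d"));
        for (Span s : merge(spans)) {
            System.out.println(s.start + "-" + s.end + " " + s.name);
        }
    }
}
```

In this example the spans 1-10, 11-20 and 15-30 collapse into a single span 1-30 named a, because 11-20 is directly adjacent to 1-10 and 15-30 overlaps 11-20.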

getUniqueIntervals

public static List<Interval> getUniqueIntervals(IntervalList list,
                                                boolean concatenateNames)
Merges list of intervals and reduces them like net.sf.picard.util.IntervalList#getUniqueIntervals()

Parameters:
concatenateNames - If false, the merged interval has the name of the earlier interval. This keeps names shorter.

getUniqueIntervals

@Deprecated
public List<Interval> getUniqueIntervals(boolean concatenateNames)
Deprecated. 

Merges list of intervals and reduces them like net.sf.picard.util.IntervalList#getUniqueIntervals()

Parameters:
concatenateNames - If false, the merged interval has the name of the earlier interval. This keeps names shorter.

getBaseCount

public long getBaseCount()
Gets the (potentially redundant) sum of the length of the intervals in the list.


getUniqueBaseCount

public long getUniqueBaseCount()
Gets the count of unique bases represented by the intervals in the list.
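
The distinction between the redundant and the unique base count can be illustrated with a minimal sketch. BaseCountSketch is a hypothetical class, intervals are plain {start, end} pairs on a single sequence, and the code is an assumption about how such counts could be computed rather than Picard's implementation.

```java
import java.util.Arrays;

/** Hypothetical sketch comparing a redundant base count with a unique base count. */
final class BaseCountSketch {
    /** Sums interval lengths; bases covered by several intervals are counted several times. */
    static long baseCount(int[][] intervals) {
        long total = 0;
        for (int[] iv : intervals) {
            total += iv[1] - iv[0] + 1;
        }
        return total;
    }

    /** Counts each covered position once by sweeping the intervals in start order. */
    static long uniqueBaseCount(int[][] intervals) {
        int[][] sorted = intervals.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[0], b[0]));
        long total = 0;
        int coveredUpTo = 0; // positions are 1-based, so 0 means nothing is covered yet
        for (int[] iv : sorted) {
            int from = Math.max(iv[0], coveredUpTo + 1);
            if (iv[1] >= from) {
                total += iv[1] - from + 1;
                coveredUpTo = iv[1];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] intervals = {{100, 200}, {150, 250}};
        System.out.println(baseCount(intervals));       // 202
        System.out.println(uniqueBaseCount(intervals)); // 151
    }
}
```

The two intervals overlap on positions 150 to 200, so those 51 bases are counted twice in the first figure and once in the second.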


size

public int size()
Returns the count of intervals in the list.


fromVcf

public static IntervalList fromVcf(File file)
Parses a VCF file and converts it to an IntervalList. The name field of each Interval is taken from the ID field of the variant, if it exists. If not, a name of the form interval-n is created, where n is a running number that increments only on unnamed intervals.

Parameters:
file - the VCF file to parse
Returns:
an IntervalList constructed from the input VCF file

fromVcf

public static IntervalList fromVcf(VCFFileReader vcf)
Converts a VCF to an IntervalList. The name field of each Interval is taken from the ID field of the variant, if it exists. If not, a name of the form interval-n is created, where n is a running number that increments only on unnamed intervals.

Parameters:
vcf - the vcfReader to be used for the conversion
Returns:
an IntervalList constructed from input vcf
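
The naming rule can be sketched without a VCF reader. In the hypothetical VcfNamingSketch below, each input string stands for a variant's ID field, a null or "." is treated as a missing ID, and the running number is assumed to start at 1; the starting value and the missing-ID test are assumptions of this example, not documented behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the interval-n naming rule; it does not read VCF files. */
final class VcfNamingSketch {
    static List<String> names(List<String> variantIds) {
        List<String> out = new ArrayList<>();
        int unnamed = 0; // assumption: numbering starts at 1
        for (String id : variantIds) {
            if (id == null || id.equals(".")) {
                // The counter advances only when a variant has no ID.
                unnamed++;
                out.add("interval-" + unnamed);
            } else {
                out.add(id);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(names(Arrays.asList("rs1", ".", "rs2", ".")));
    }
}
```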

copyOf

public static IntervalList copyOf(IntervalList list)
Creates an independent copy of the given IntervalList.

Parameters:
list - the IntervalList to copy
Returns:
an independent copy of list

fromFile

public static IntervalList fromFile(File file)
Parses an interval list from a file.

Parameters:
file - the file containing the intervals
Returns:
an IntervalList object that contains the headers and intervals from the file

fromReader

public static IntervalList fromReader(BufferedReader in)
Parses an interval list from a reader in a stream based fashion.

Parameters:
in - a BufferedReader that can be read from
Returns:
an IntervalList object that contains the headers and intervals from the file

write

public void write(File file)
Writes out the list of intervals to the supplied file.

Parameters:
file - a file to write to. If the file exists, it will be overwritten.

intersection

public static IntervalList intersection(IntervalList list1,
                                        IntervalList list2)
A utility function for generating the intersection of two IntervalLists, checks for equal dictionaries.

Parameters:
list1 - the first IntervalList
list2 - the second IntervalList
Returns:
the intersection of list1 and list2.
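
For intervals on a single sequence, an intersection can be sketched with a two-pointer sweep. IntersectSketch is a hypothetical class that assumes both inputs are already sorted and free of overlaps; it skips the dictionary check and is not Picard's implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: intersects two sorted, non-overlapping interval arrays. */
final class IntersectSketch {
    static List<int[]> intersect(int[][] a, int[][] b) {
        List<int[]> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            int start = Math.max(a[i][0], b[j][0]);
            int end = Math.min(a[i][1], b[j][1]);
            if (start <= end) {
                out.add(new int[] {start, end});
            }
            // Advance whichever interval ends first; the other may still overlap more.
            if (a[i][1] < b[j][1]) {
                i++;
            } else {
                j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 10}, {20, 30}};
        int[][] b = {{5, 25}};
        for (int[] iv : intersect(a, b)) {
            System.out.println(iv[0] + "-" + iv[1]);
        }
    }
}
```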

intersection

public static IntervalList intersection(Collection<IntervalList> lists)
A utility function for intersecting a list of IntervalLists, checks for equal dictionaries.

Parameters:
lists - the list of IntervalList
Returns:
the intersection of all the IntervalLists in lists.

concatenate

public static IntervalList concatenate(Collection<IntervalList> lists)
A utility function for merging a list of IntervalLists, checks for equal dictionaries. Merging does not look for overlapping intervals, nor does it uniquify.

Parameters:
lists - a list of IntervalList
Returns:
the concatenation of all the IntervalLists in lists.

union

public static IntervalList union(Collection<IntervalList> lists)
A utility function for finding the union of a list of IntervalLists, checks for equal dictionaries. Also looks for overlapping intervals, uniquifies, and sorts (by coordinate).

Parameters:
lists - the list of IntervalList
Returns:
the union of all the IntervalLists in lists.

union

public static IntervalList union(IntervalList list1,
                                 IntervalList list2)

invert

public static IntervalList invert(IntervalList list)
Inverts an IntervalList and returns one that contains exactly the bases in the dictionary that the original does not.

Parameters:
list - an IntervalList
Returns:
an IntervalList that is complementary to list
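
Inversion on a single sequence amounts to collecting the gaps between intervals. The hypothetical InvertSketch below assumes the input is sorted and non-overlapping and that the sequence length is passed in directly, where the real method would take it from the dictionary; it is a sketch of the idea only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: complements sorted, non-overlapping intervals on one sequence. */
final class InvertSketch {
    static List<int[]> invert(int[][] sortedUnique, int sequenceLength) {
        List<int[]> gaps = new ArrayList<>();
        int next = 1; // first position not yet accounted for (1-based)
        for (int[] iv : sortedUnique) {
            if (iv[0] > next) {
                gaps.add(new int[] {next, iv[0] - 1});
            }
            next = Math.max(next, iv[1] + 1);
        }
        if (next <= sequenceLength) {
            gaps.add(new int[] {next, sequenceLength});
        }
        return gaps;
    }

    public static void main(String[] args) {
        int[][] intervals = {{10, 20}, {50, 60}};
        for (int[] gap : invert(intervals, 100)) {
            System.out.println(gap[0] + "-" + gap[1]);
        }
    }
}
```

For a sequence of length 100 and the intervals 10-20 and 50-60, the complement is 1-9, 21-49 and 61-100.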

subtract

public static IntervalList subtract(Collection<IntervalList> listsToSubtractFrom,
                                    Collection<IntervalList> listsToSubtract)
A utility function for subtracting a collection of IntervalLists from another. Resulting loci are those that are in the first collection but not the second.

Parameters:
listsToSubtractFrom - the collection of IntervalList from which to subtract intervals
listsToSubtract - the collection of intervals to subtract
Returns:
an IntervalList comprising all loci that are in the first collection but not the second.

difference

public static IntervalList difference(Collection<IntervalList> lists1,
                                      Collection<IntervalList> lists2)
A utility function for finding the difference between two IntervalLists.

Parameters:
lists1 - the first collection of IntervalLists
lists2 - the second collection of IntervalLists
Returns:
the difference between the two collections, i.e. the loci that are in exactly one of them but not both
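
The relationship between subtract and difference can be sketched on a single sequence. In the hypothetical SetOpsSketch below, subtract removes one sorted, non-overlapping list from another, and difference is taken as the loci of subtract(a, b) together with those of subtract(b, a). This is an assumption-laden illustration, not Picard's implementation, and adjacent result pieces are not merged.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of subtract and symmetric difference on sorted, non-overlapping lists. */
final class SetOpsSketch {
    /** Returns the loci that are in 'from' but not in 'remove'. */
    static List<int[]> subtract(List<int[]> from, List<int[]> remove) {
        List<int[]> out = new ArrayList<>();
        int j = 0;
        for (int[] iv : from) {
            int start = iv[0];
            int end = iv[1];
            // Skip removal intervals that end before this interval begins.
            while (j < remove.size() && remove.get(j)[1] < start) {
                j++;
            }
            // Walk the removal intervals that start inside or before the end of this interval.
            for (int k = j; k < remove.size() && remove.get(k)[0] <= end; k++) {
                int[] r = remove.get(k);
                if (r[0] > start) {
                    out.add(new int[] {start, r[0] - 1});
                }
                start = Math.max(start, r[1] + 1);
            }
            if (start <= end) {
                out.add(new int[] {start, end});
            }
        }
        return out;
    }

    /** Returns the loci that are in exactly one of the two lists, sorted by start. */
    static List<int[]> difference(List<int[]> a, List<int[]> b) {
        List<int[]> out = new ArrayList<>(subtract(a, b));
        out.addAll(subtract(b, a));
        out.sort((x, y) -> Integer.compare(x[0], y[0]));
        return out;
    }

    public static void main(String[] args) {
        List<int[]> a = new ArrayList<>();
        a.add(new int[] {1, 10});
        a.add(new int[] {20, 30});
        List<int[]> b = new ArrayList<>();
        b.add(new int[] {5, 25});
        for (int[] iv : difference(a, b)) {
            System.out.println(iv[0] + "-" + iv[1]);
        }
    }
}
```

With a = 1-10, 20-30 and b = 5-25, subtract(a, b) yields 1-4 and 26-30, subtract(b, a) yields 11-19, and the difference is the three pieces together.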