Add methods on already-sorted sequences that remove or count duplicates. #257

Open
wants to merge 6 commits into
base: main

Conversation

CTMacUser
Contributor

Description

Add two method families to sequences. Each assumes that the receiver's elements are already sorted according to the given predicate (or, by default, the < operator). Each has eager versions on Sequence and lazy versions on LazySequenceProtocol. The withoutSortedDuplicates family removes every element of each run of equivalent values except the first. The countSortedDuplicates family reuses the same loop to return each unique element paired with its count.
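As a rough illustration of the run-collapsing loop described above, here is a minimal eager sketch. The `sketch`-prefixed method names are hypothetical and chosen to avoid implying that this is the PR's actual implementation; the only assumption taken from the description is that, in a sorted sequence, two neighbours are equivalent exactly when the predicate does not order the earlier one before the later one.

```swift
extension Sequence {
  /// Hypothetical sketch: pairs the first element of each run of
  /// equivalent values with the length of that run.
  func sketchCountSortedDuplicates(
    by areInIncreasingOrder: (Element, Element) throws -> Bool
  ) rethrows -> [(value: Element, count: Int)] {
    var result: [(value: Element, count: Int)] = []
    var iterator = makeIterator()
    guard var current = iterator.next() else { return result }
    var count = 1
    while let next = iterator.next() {
      // The receiver is assumed sorted, so `next` never precedes `current`.
      // If `current` does not precede `next` either, they are equivalent.
      if try areInIncreasingOrder(current, next) {
        result.append((value: current, count: count))
        current = next
        count = 1
      } else {
        count += 1
      }
    }
    // Flush the final run.
    result.append((value: current, count: count))
    return result
  }

  /// Hypothetical sketch: keeps only the first element of each run.
  func sketchWithoutSortedDuplicates(
    by areInIncreasingOrder: (Element, Element) throws -> Bool
  ) rethrows -> [Element] {
    try sketchCountSortedDuplicates(by: areInIncreasingOrder).map { $0.value }
  }
}

let sorted = [1, 1, 2, 3, 3, 3]
let counts = sorted.sketchCountSortedDuplicates(by: <)
let uniques = sorted.sketchWithoutSortedDuplicates(by: <)
```

Building the deduplicating method on top of the counting one mirrors the description's remark that both share one loop, at the cost of allocating the intermediate array of pairs.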

Detailed Design

extension Sequence {
  /// Assuming this sequence is already sorted along the given predicate,
  /// return an array of each unique element paired with its number of
  /// occurrences.
  @inlinable
  public func countSortedDuplicates(
    by areInIncreasingOrder: (Element, Element) throws -> Bool
  ) rethrows -> [(value: Element, count: Int)]

  /// Assuming this sequence is already sorted along the given predicate,
  /// return an array of each unique element, by equivalence class.
  @inlinable
  public func withoutSortedDuplicates(
    by areInIncreasingOrder: (Element, Element) throws -> Bool
  ) rethrows -> [Element]
}

extension Sequence where Element: Comparable {
  /// Assuming this sequence is already sorted,
  /// return an array of each unique value paired with its number of
  /// occurrences.
  @inlinable
  public func countSortedDuplicates() -> [(value: Element, count: Int)]

  /// Assuming this sequence is already sorted,
  /// return an array of the first element of each run of equal values.
  @inlinable
  public func withoutSortedDuplicates() -> [Element]
}

extension LazySequenceProtocol {
  /// Assuming this sequence is already sorted along the given predicate,
  /// return a sequence that will lazily generate each unique
  /// element paired with its number of occurrences.
  @inlinable
  public func countSortedDuplicates(
    by areInIncreasingOrder: @escaping (Element, Element) -> Bool
  ) -> LazyCountDuplicatesSequence<Elements>

  /// Assuming this sequence is already sorted along the given predicate,
  /// return a sequence that will lazily vend each unique element.
  @inlinable
  public func withoutSortedDuplicates(
    by areInIncreasingOrder: @escaping (Element, Element) -> Bool
  ) -> some (Sequence<Element> & LazySequenceProtocol)
}

extension LazySequenceProtocol where Element: Comparable {
  /// Assuming this sequence is already sorted,
  /// return a sequence that will lazily generate each unique value
  /// paired with its number of occurrences.
  @inlinable
  public func countSortedDuplicates() -> LazyCountDuplicatesSequence<Elements>

  /// Assuming this sequence is already sorted,
  /// return a sequence that will lazily vend each unique value.
  @inlinable
  public func withoutSortedDuplicates() -> some (
    Sequence<Element> & LazySequenceProtocol
  )
}

/// Lazily vends the count of each run of duplicate values from
/// a sorted source.
public struct LazyCountDuplicatesSequence<Base: Sequence>
 : LazySequenceProtocol { /*...*/}

/// Vends the count of each run of duplicate values from a sorted source.
public struct CountDuplicatesIterator<Base: IteratorProtocol>
 : IteratorProtocol { /*...*/}
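
The bodies of the lazy types are elided above. One way such an iterator could work is sketched below; `SketchCountDuplicatesIterator` and its stored properties are hypothetical names, and this is an assumption about the approach rather than the PR's code. It reads one element ahead so that the first element of the next run is available when the current run ends.

```swift
/// Hypothetical sketch of a run-counting iterator over a sorted source.
struct SketchCountDuplicatesIterator<Base: IteratorProtocol>: IteratorProtocol {
  var base: Base
  let areInIncreasingOrder: (Base.Element, Base.Element) -> Bool
  /// First element of the next run, read ahead while finishing the last one.
  var pending: Base.Element?
  /// Whether the first element has been pulled from `base` yet.
  var started = false

  init(
    _ base: Base,
    by areInIncreasingOrder: @escaping (Base.Element, Base.Element) -> Bool
  ) {
    self.base = base
    self.areInIncreasingOrder = areInIncreasingOrder
  }

  mutating func next() -> (value: Base.Element, count: Int)? {
    if !started {
      pending = base.next()
      started = true
    }
    guard let current = pending else { return nil }
    var count = 1
    while let candidate = base.next() {
      if areInIncreasingOrder(current, candidate) {
        // `candidate` starts a new run; remember it for the next call.
        pending = candidate
        return (value: current, count: count)
      }
      count += 1
    }
    // The source is exhausted; emit the final run and stop afterwards.
    pending = nil
    return (value: current, count: count)
  }
}

var iterator = SketchCountDuplicatesIterator(
  ["a", "a", "b", "c", "c"].makeIterator(), by: <)
var runs: [(value: String, count: Int)] = []
while let run = iterator.next() {
  runs.append(run)
}
```

Nothing is consumed from the source until the first call to `next()`, which keeps the wrapper lazy; the escaping, non-throwing predicate matches the lazy signatures listed above.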

Documentation Plan

A guide file has been provided, and other parts of the documentation have been adjusted.

Test Plan

A test file has been provided.

Source Impact

The changes are strictly additive.

Checklist

  • I've added at least one test that validates that my change is working, if appropriate
  • I've followed the code style of the rest of the project
  • I've read the Contribution Guidelines
  • I've updated the documentation if necessary

Add a method for already-sorted sequences that returns each unique value, paired with the count of each value. Add another method that returns each unique value. Have both eager and lazy variants for each method.
1 participant
@CTMacUser