Intersection and union of two sets

Documentation

  1. Class set in the Python Standard Library
  2. Sets in the Python Tutorial

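In lines 128–133, use zip when the columns are of equal length and itertools.zip_longest when they are of unequal length: zip stops at the end of the shortest column, while zip_longest pads the shorter columns with a fill value. We saw zip in Zip.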
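
The difference between the two functions can be seen in a small sketch. The three lists below are made-up stand-ins for the columns, using a few names from the output further down; they are not the variables used in intersect.py.

```python
import itertools

# Hypothetical columns of unequal length.
left = ["Abraham Lincoln", "Andrew Jackson", "Barack Obama"]
middle = ["Andrew Johnson"]
right = ["Aaron Burr", "Adlai Stevenson"]

# zip stops as soon as the shortest column runs out.
truncated = list(zip(left, middle, right))

# zip_longest keeps going until the longest column runs out,
# padding the shorter columns with fillvalue.
padded = list(itertools.zip_longest(left, middle, right, fillvalue = ""))

print(len(truncated))   # 1 row: the other names were silently dropped
print(len(padded))      # 3 rows: one per item in the longest column

for row in padded:
    print(row)
```

With zip, two presidents and one vice president would vanish from the table without any error message, which is why the unequal columns need zip_longest.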

intersect.py

Only POTUS             POTUS & VP             Only VP
----------             ----------             -------
Abraham Lincoln        Andrew Johnson         Aaron Burr
Andrew Jackson         Calvin Coolidge        Adlai Stevenson
Barack Obama           Chester A. Arthur      Al Gore
Benjamin Harrison      George H. W. Bush      Alben W. Barkley
Bill Clinton           Gerald Ford            Charles Curtis
Donald Trump           Harry S. Truman        Charles G. Dawes
Dwight D. Eisenhower   John Adams             Charles W. Fairbanks
Franklin D. Roosevelt  John Tyler             Dan Quayle
Franklin Pierce        Lyndon B. Johnson      Daniel D. Tompkins
George W. Bush         Martin Van Buren       Dick Cheney
George Washington      Millard Fillmore       Elbridge Gerry
Grover Cleveland       Richard Nixon          Garret Hobart
Herbert Hoover         Theodore Roosevelt     George Clinton
James A. Garfield      Thomas Jefferson       George M. Dallas
James Buchanan                                Hannibal Hamlin
James K. Polk                                 Henry A. Wallace
James Madison                                 Henry Wilson
James Monroe                                  Hubert Humphrey
Jimmy Carter                                  James S. Sherman
John F. Kennedy                               Joe Biden
John Quincy Adams                             John C. Breckinridge
Ronald Reagan                                 John C. Calhoun
Rutherford B. Hayes                           John N. Garner
Ulysses S. Grant                              Levi P. Morton
Warren G. Harding                             Mike Pence
William Henry Harrison                        Nelson Rockefeller
William Howard Taft                           Richard M. Johnson
William McKinley                              Schuyler Colfax
Woodrow Wilson                                Spiro Agnew
Zachary Taylor                                Thomas A. Hendricks
                                              Thomas R. Marshall
                                              Walter Mondale
                                              William A. Wheeler
                                              William R. King
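
The three columns above correspond to two set differences and one intersection. The following is a minimal sketch with only a handful of the names. The variable names presidents and vicePresidents are assumptions; onlyPresidents, intersection, onlyVicePresidents, and union are the names that appear in the exercises below. The real script presumably builds its sets from the complete lists.

```python
# Small hypothetical subsets of the two complete lists.
presidents = {"Abraham Lincoln", "Andrew Johnson", "John Adams", "Barack Obama"}
vicePresidents = {"Andrew Johnson", "John Adams", "Aaron Burr", "Al Gore"}

intersection = presidents & vicePresidents        # in both sets
union = presidents | vicePresidents               # in at least one set
onlyPresidents = presidents - vicePresidents      # presidents who were never VP
onlyVicePresidents = vicePresidents - presidents  # VPs who were never president

# A set has no order, so sort each one before printing.
print(sorted(onlyPresidents))
print(sorted(intersection))
print(sorted(onlyVicePresidents))
```

The operators &, |, and - have method equivalents (intersection, union, difference) that also accept any iterable as their argument.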

Things to try

  1. Sort by last name, then by the rest of the name. score is a named function rather than an inline lambda because it is used as the key in more than one place.
    def score(name):
        """
        Return the name with the last name moved to the front.
        For example, "Lyndon B. Johnson" becomes "Johnson Lyndon B.".
        """
        restOfName, lastName = name.rsplit(maxsplit = 1)
        return f"{lastName} {restOfName}"
    
    threeColumns = itertools.zip_longest(
        sorted(onlyPresidents,     key = score),
        sorted(intersection,       key = score),
        sorted(onlyVicePresidents, key = score),
        fillvalue = ""
    )
    
    Only POTUS             POTUS & VP             Only VP
    ----------             ----------             -------
    John Quincy Adams      John Adams             Spiro Agnew
    James Buchanan         Chester A. Arthur      Alben W. Barkley
    George W. Bush         Martin Van Buren       Joe Biden
    Jimmy Carter           George H. W. Bush      John C. Breckinridge
    Grover Cleveland       Calvin Coolidge        Aaron Burr
    Bill Clinton           Millard Fillmore       John C. Calhoun
    Dwight D. Eisenhower   Gerald Ford            Dick Cheney
    James A. Garfield      Thomas Jefferson       George Clinton
    Ulysses S. Grant       Andrew Johnson         Schuyler Colfax
    Warren G. Harding      Lyndon B. Johnson      Charles Curtis
    Benjamin Harrison      Richard Nixon          George M. Dallas
    William Henry Harrison Theodore Roosevelt     Charles G. Dawes
    Rutherford B. Hayes    Harry S. Truman        Charles W. Fairbanks
    Herbert Hoover         John Tyler             John N. Garner
    Andrew Jackson                                Elbridge Gerry
    John F. Kennedy                               Al Gore
    Abraham Lincoln                               Hannibal Hamlin
    James Madison                                 Thomas A. Hendricks
    William McKinley                              Garret Hobart
    James Monroe                                  Hubert Humphrey
    Barack Obama                                  Richard M. Johnson
    Franklin Pierce                               William R. King
    James K. Polk                                 Thomas R. Marshall
    Ronald Reagan                                 Walter Mondale
    Franklin D. Roosevelt                         Levi P. Morton
    William Howard Taft                           Mike Pence
    Zachary Taylor                                Dan Quayle
    Donald Trump                                  Nelson Rockefeller
    George Washington                             James S. Sherman
    Woodrow Wilson                                Adlai Stevenson
                                                  Daniel D. Tompkins
                                                  Henry A. Wallace
                                                  William A. Wheeler
                                                  Henry Wilson
    
    Would rpartition be simpler than rsplit?
  2. Don’t hardcode the number 22 into line 136 of the above script. Instead, change lines 135–141 to the following. The format string f now contains five pairs of curly braces, so each call to format now takes five arguments. See “nested replacement fields” in Format String Syntax.
    maxlen = max([len(name) for name in union]) #list comprehension
    maxlen = len(max(union, key = len))         #simpler way to get the same answer
    f = "{:{}} {:{}} {}"
    
    print(f.format("Only POTUS", maxlen, "POTUS & VP", maxlen, "Only VP"))
    print(f.format("----------", maxlen, "----------", maxlen, "-------"))
    
    for left, middle, right in threeColumns:
        print(f.format(left, maxlen, middle, maxlen, right))
    
  3. """
    List the letters that are missing from the string.
    """
    
    import sys
    import string
    
    s = "Pack my box with five dozen liquor jugs." #pangram
    
    #Prep the patient for surgery.
    listOfLetters = [c for c in s if c.isalpha()]  #listOfLetters is a list of one-character strings.
    stringOfLetters = "".join(listOfLetters)       #stringOfLetters is a string.
    s = stringOfLetters.lower()
    
    setOfMissingLetters = set(string.ascii_lowercase) - set(s) #or setOfMissingLetters = set(string.ascii_lowercase).difference(s)
    listOfMissingLetters = sorted(setOfMissingLetters)
    stringOfMissingLetters = "".join(listOfMissingLetters)
    
    if stringOfMissingLetters: #true if the stringOfMissingLetters contains at least one character
        print(f'The following letters are missing: "{stringOfMissingLetters}"')
        sys.exit(1)
    else:
        print("No letters are missing.")
        sys.exit(0)
    
    No letters are missing.
    
  4. Output the names of the /~meretzkm/python/ files that have never been served by the web server on oit2.scps.nyu.edu. See Served.
    import os
    
    #Create a set of the names of all the /~meretzkm/python/ files
    #on the server oit2.scps.nyu.edu.
    
    filenames = set()   #Start with an empty set.
    
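    root = "/home/m/meretzkm/public_html"   #corresponds to /~meretzkm/ in the URL
    
    for dirname, subdirnames, basenames in os.walk(os.path.join(root, "python")):
        for basename in basenames:
            #Convert the filesystem path to the path that appears in the URL.
            relative = os.path.relpath(os.path.join(dirname, basename), root)
            filenames.add("/~meretzkm/" + relative)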
    
    for filename in filenames:
        print(filename)
    
    Then subtract from this set (as in lines 121–125 of intersect.py) all the files that have been served by the web server. Also, replace the hardcoded /home/m/meretzkm with the os.path.expanduser we saw in Binary.
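
The subtraction in the last exercise might look like the following sketch. The set served is a hypothetical stand-in for the names collected from the web server's log (that code is in Served and is not reproduced here), and the sample file names are invented for illustration. The sketch passes "~" to os.path.expanduser so that it runs for any user; on the server, "~meretzkm" would name that one user's home directory.

```python
import os

# Hypothetical data: every file under the python directory ...
filenames = {
    "/~meretzkm/python/index.html",
    "/~meretzkm/python/intersect.py",
    "/~meretzkm/python/zip.py",
}

# ... and the files that the web server has served at least once.
served = {
    "/~meretzkm/python/index.html",
    "/~meretzkm/python/zip.py",
    "/~meretzkm/other/page.html",   # not in filenames, so it has no effect
}

neverServed = filenames - served    # or filenames.difference(served)

for filename in sorted(neverServed):
    print(filename)

# expanduser replaces a leading ~ or ~user with a home directory,
# so the directory name need not be hardcoded.
root = os.path.join(os.path.expanduser("~"), "public_html", "python")
print(root)
```

Subtracting a set that contains extra names is harmless: the difference keeps only the members of the left operand that are absent from the right one.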