Columbia University information, such as course descriptions, faculty listings, addresses, news, and other communications, usually resides on departmental websites. While some of it may be printed or stored in other formats, the bulk has been disseminated solely through the Internet during the past ten years.
During the transition from print to web-only course catalogues in the early 2000s, administrators and librarians were concerned that this information would be lost unless a system was in place to capture and preserve Columbia University’s web content.
The Internet Archive offered a solution with its Wayback Machine, launched in 2001, which allows the public to search and access its saved websites. In 2006, the Internet Archive began offering a subscription web archiving service, Archive-It. Columbia University subscribed to this service in 2010, although the Internet Archive had crawled many Columbia University websites before the University’s subscription.
If you know a website’s URL, simply enter it into the Wayback Machine.
For example, entering the URL of Columbia University’s Program in Physical Therapy website, www.ps.columbia.edu/education/academic-programs/programs-physical-therapy/, returns a timeline showing the number of captures. Click a date to view the saved webpage, as seen here:
Captures of the site extend back only to 2013 because the URL changed mid-year.
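The same lookup can be done by building the Wayback Machine address directly instead of typing the URL into the search box. The following is a minimal sketch in Python, not an official client: the function names are hypothetical, and it only constructs addresses to open in a browser or fetch with any HTTP tool. It assumes the publicly documented address patterns (web.archive.org/web/*/URL for the capture calendar, web.archive.org/web/TIMESTAMP/URL for the capture closest to a date, and the availability endpoint at archive.org/wayback/available).

```python
from urllib.parse import urlencode

WAYBACK_BASE = "https://web.archive.org/web"
AVAILABILITY_API = "https://archive.org/wayback/available"


def calendar_url(page_url: str) -> str:
    """Address of the capture calendar (the timeline view) for page_url."""
    return f"{WAYBACK_BASE}/*/{page_url}"


def capture_url(page_url: str, timestamp: str) -> str:
    """Address of the capture closest to timestamp (YYYYMMDD or YYYYMMDDhhmmss)."""
    return f"{WAYBACK_BASE}/{timestamp}/{page_url}"


def availability_query(page_url: str, timestamp: str = "") -> str:
    """Query address for the availability endpoint, which answers in JSON."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_API}?{urlencode(params)}"


if __name__ == "__main__":
    # URL taken from the Physical Therapy example above.
    pt_url = "www.ps.columbia.edu/education/academic-programs/programs-physical-therapy/"
    print(calendar_url(pt_url))
    print(capture_url(pt_url, "20131231"))
    print(availability_query(pt_url, "20131231"))
```

Opening the first printed address in a browser should show the same timeline that the search box returns.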
One way around the problem of constantly changing URLs is to search Columbia Libraries’ Archive-It site, which indexes sites by “creator” and “collector,” with the Augustus C. Long Library listed under the collector University Archives.
From this list you can go directly to the old websites, or you can use other access points such as department, creator, school, and captures.
Back to the Wayback Machine
You can also enter the old URL into the Wayback Machine to check whether any pages were captured before 2013. In this case they were, with captures going back to 2002:
In this way, the Archive-It service is an important additional tool for finding past websites. To find the earliest captured webpage, it helps to know that the Internet Archive has been crawling and saving www.columbia.edu since 1996 and has been crawling websites with the columbia.edu domain since 2010.
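Looking for the earliest capture of a URL can also be scripted. The sketch below assumes the Wayback Machine’s CDX index endpoint (web.archive.org/cdx/search/cdx) and its url, output, fl, limit, and to parameters. The function names are hypothetical, and the approach relies on the index listing captures of a single URL in chronological order, so that the first row is the oldest.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_API = "https://web.archive.org/cdx/search/cdx"


def earliest_capture_query(page_url: str, before: str = "") -> str:
    """Build a CDX query that asks for only the first listed capture of page_url."""
    params = {
        "url": page_url,
        "output": "json",
        "fl": "timestamp,original",  # fields to return
        "limit": "1",                # first row only
    }
    if before:
        params["to"] = before        # e.g. "2013" to ignore later captures
    return f"{CDX_API}?{urlencode(params)}"


def parse_first_capture(rows):
    """Turn the JSON rows (a header row, then data rows) into a dict, or None."""
    if len(rows) < 2:
        return None
    header, first = rows[0], rows[1]
    return dict(zip(header, first))


def earliest_capture(page_url: str, before: str = ""):
    """Fetch and parse the earliest capture; requires network access."""
    with urlopen(earliest_capture_query(page_url, before), timeout=30) as resp:
        body = resp.read().decode("utf-8")
    return parse_first_capture(json.loads(body)) if body.strip() else None
```

Calling earliest_capture with a department’s old URL and before="2013" would return the timestamp and address of the oldest saved copy, if one exists.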
With this in mind, another strategy is to search columbia.edu and see how far you can navigate within the archived site:
For example, the Health Sciences link leads to:
The Internet Archive also offers browser plugins for quickly navigating to and sharing “archived” websites, an important tool when encountering defunct websites or “link rot.”
Because of these dead links, it has become more common to cite the archived copy of a website from the Internet Archive as the source. Granted, not all institutions and individuals allow their sites to be crawled by the Internet Archive, and some may use other tools for capturing and saving websites. Yet, depending on the citation tool and standard, citing archived sites can help ensure sources remain reliable. Citation integration will continue to evolve with the popularity and growth of the Internet Archive.
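As a rough illustration of citing an archived copy, the sketch below pairs the original address with its Wayback Machine address and capture date. The function name and the citation layout are hypothetical and do not follow any particular style guide; the only assumption about the Wayback Machine is that a capture is addressed by a 14-digit timestamp (YYYYMMDDhhmmss) followed by the original URL.

```python
from datetime import datetime


def archived_citation(title: str, original_url: str, timestamp: str) -> str:
    """Format a simple citation line that points to an archived capture.

    timestamp is the 14-digit Wayback Machine timestamp, YYYYMMDDhhmmss.
    The layout is illustrative only, not a recognized citation style.
    """
    captured = datetime.strptime(timestamp, "%Y%m%d%H%M%S").date()
    archived_url = f"https://web.archive.org/web/{timestamp}/{original_url}"
    return f"{title}. {original_url}. Archived {captured.isoformat()} at {archived_url}."
```

A citation manager or style guide may require a different order or wording; the point is only that both the original and the archived addresses are recorded together with the capture date.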