Monthly Archives: May 2007

The Lana Turner Diet (i.e. Aspect Ratios and Normalization)

While encoding a series of video clips for Seth Fein yesterday, I noticed that the final output had two large black bars on either side of the frame. Looking back at some of the older clips, it seemed that most of them contained these black bars as well. Not only was this an inefficient method of digitization (black bars eat up file space and degrade overall video quality), but it also seemed to be horizontally “squeezing” the image.

Initially, I thought this might be a problem with the source media (in this case a DVD). If the bars were on the raw video, they could be easily cropped. But no such luck. I then traced the file through each step of the encoding process to determine where the bars were introduced. The standardized process for the Fein videos consists of about 30 steps, during which the raw video is transcoded four times by three separate programs.

1.) The video is captured by EDIUS in raw DV format.

2.) The raw EDIUS file is then “printed” to an .avi (also DV).

3.) The “printed” .avi is loaded into Sound Forge, where its audio track is normalized and the file is saved.

4.) The normalized .avi is transcoded into separate .wmv and .rm files using ProCoder.

The bars were introduced, curiously enough, while saving the normalized file in Sound Forge. The reason for this is complex. But, in essence, it’s because DV format is standardized at 720×480, which is not a true 4:3 aspect ratio if we’re talking about square pixels (for more info, see this website). This is why DV captures sometimes look “stretched.” Sound Forge is confused by this and attempts to fix the ratio, hence the black bars. If the DV file is not normalized and is sent directly through ProCoder, the latter program correctly adjusts the ratio to 4:3 (which was the correct ratio for all of the Fein movies on which I was working).
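To make the arithmetic concrete, here is a minimal Python sketch. It assumes the full 720-pixel width carries the 4:3 picture, which is a simplification, and the function names are made up for illustration rather than taken from any of the programs above.

```python
from fractions import Fraction


def storage_aspect(width: int, height: int) -> Fraction:
    """Aspect ratio of the pixel grid itself, i.e. assuming square pixels."""
    return Fraction(width, height)


def pixel_aspect_for_display(width: int, height: int, display: Fraction) -> Fraction:
    """Pixel shape (width / height) needed for the grid to fill `display`."""
    return display / storage_aspect(width, height)


def square_pixel_width(height: int, display: Fraction) -> int:
    """Width in square pixels that shows `height` lines at `display`."""
    return round(height * display)


target = Fraction(4, 3)
print(storage_aspect(720, 480))                    # 3/2, not 4/3
print(pixel_aspect_for_display(720, 480, target))  # 8/9, narrower than square
print(square_pixel_width(480, target))             # 640
```

Under that assumption, a 720×480 frame shown with square pixels comes out 12.5% wider than 4:3, which matches the “stretched” look described above.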

Here are three clips of John Garfield and Lana Turner to illustrate the different stages of the process.

A screen capture of the DVD:
[Image: lana turner - dvd]

The slightly elongated DV output from EDIUS:
[Image: lana turner - fat]

The black bar version produced by Sound Forge:
[Image: lana turner - thin]

So this raises the question – what to do about audio normalization? If unprocessed, the audio track on most of these old movies remains low and uneven. ProCoder has a normalization filter that can be applied during the final stage (thus cutting Sound Forge completely out of the process). Unfortunately, however, the ProCoder filter is terrible. It does not improve sound quality nearly as well as Sound Forge, no matter how it is configured.

So this leaves Sound Forge. I’ve discovered that the black bars problem arises when you “save” the .avi. But if you use the “save as” command and select “Default Template (uncompressed),” it will maintain the original aspect ratio and no black bars will be produced.

Of course, this is far from an ideal solution. I’m still encoding the film twice before the final pass! And there is noticeable quality degradation in the video stream after saving it in Sound Forge. Another alternative would be to extract the audio from the .avi, normalize it separately in Sound Forge, and then re-multiplex the .avi. I need to investigate this further and look into other options to streamline normalization (especially on Macs, if we’re going to be switching from .wmv to QuickTime).
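For reference, the simplest form of normalization on an extracted audio track is peak normalization: find the loudest sample and scale everything so that it reaches a target level. The sketch below is a generic illustration for 16-bit PCM samples, not what Sound Forge does internally; the function name and the default target are assumptions.

```python
from array import array


def peak_normalize(samples: array, target_peak: int = 29204) -> array:
    """Scale 16-bit PCM samples so the loudest one reaches target_peak.

    The default target is roughly -1 dBFS, an arbitrary choice for this
    sketch rather than a Sound Forge setting.
    """
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return array("h", samples)  # silence: nothing to scale
    gain = target_peak / peak
    scaled = (int(round(s * gain)) for s in samples)
    # Clamp to the signed 16-bit range in case rounding overshoots.
    return array("h", (max(-32768, min(32767, v)) for v in scaled))


quiet = array("h", [0, 1000, -2000, 500])
louder = peak_normalize(quiet)
print(max(abs(s) for s in louder))  # 29204
```

The samples could be read from and written back to a WAV file with Python’s standard `wave` module, leaving the video stream untouched until the re-multiplexing step.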

To be continued…

Karen Foster digital images

Prof. Foster still teaches with slides. Each week on Wed. she would leave the 30 to 40 slides she used in class in her mailbox. The Thursday student assistant would pick up those slides and digitize the week’s worth of images. Those images were posted on the Quick Dig site. The student would grab the images, create a static HTML page from a template, and post it. The process took 45 minutes to an hour. The total number of digitized slides was around 800. It is possible that next year’s courses (though I don’t know what Foster is teaching) could have 800 different slides. Given that the digitized objects have no metadata and there is no way to search them or sort them into groups, Mary felt it would probably be easier to digitize them again.

Christopher Wood Portfolio, course and classroom images

Need to assist in downloading the catalog to his local machine. If he adds images to the local catalog or does any updating, the local and served catalogs will have to be synced at some point. Mary always did this for him.

Needs the updated OIV on his machine.

Need to go over the XMP data and how to auto-populate data fields. The VRC needs to do this for all images in the size 4 folder.

Ask if he wants to disable the lectures on NetPublish.

Once he has his room assignment, test the projection system and ask Arthur G(?) from Media Services for help.

Remind the VRC to create export groups containing the union of
a. everything Anne requests
b. everything Chris requests
c. everything that answers to the description: European Art 1300-1800

Workflow with detailed instructions for using digital images in teaching *have printout, need electronic version from Mary
Projecting from the laptop in LC211 *have printout, need electronic version
Intermediate Photoshop top 5 tools *have printout, need electronic version

Goetzmann redux

An attempt to write about the project: its origins, its status and its future use/development. I need to have a formal write-up to present to CMI and Web Team. I just want to get everything down.
Financial Bonds Virtual Museum
William Goetzmann and Geert Rouwenhorst

Origins of the project

Goetzmann’s collection of historical financial bonds was scanned and made into high-resolution TIFFs. He was trying to keep track of them using iPhoto, but the large image size was creating a problem on his Mac. He was also teaching a course at the Beinecke on the history of finance and wanted to use the bonds in class for teaching as well as outside of class for review. He was working with the Beinecke to digitize some of the materials that the Beinecke owns, such as lottery tickets. Katherine A. was asked to advise on the best delivery method for the digital images. She invited Gloria Hardman to speak about V2 and me to propose creating static galleries for posting on V2.

Our initial meeting made clear that the images were not in a format suitable for web viewing. There was also no metadata associated with the images, except for the ones Goetzmann had renamed, whose filenames were meaningful. Upon further discussion it became clear that searchable images, feedback and zoomable image files were preferred to a static site. I suggested Portfolio as a means of cataloging the images and NetPublish as a delivery method, which made both the search feature and email feedback possible.

Current Status:

To zoom the images, I downloaded the free version of Zoomify, which takes large TIFF files and creates many small JPEGs that are delivered via a Flash player. The TIFFs were cropped and saved as TIFFs and JPEGs by Sarah Coe in the VRC. We found that using an external drive was much faster than burning a CD, although I did burn several DVDs of the cropped TIFFs for archival purposes. The JPEGs are in the top folder of the Goetzmann share on Media 4. * This should be changed to place these JPEGs into a folder so that the top level remains easily readable; the zoom folders should go inside a folder as well. Media 4 was chosen because it is open to the world and accessible by those outside of Yale. Goetzmann wants to share this resource with peer institutions. The zoom folders are at the top level as well and are linked to the detail button on the display template in NetPublish. The detail button calls the detail template, which has the Flash player embedded on the page.
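To give a sense of why one large TIFF turns into so many small JPEGs, here is a rough Python sketch of a generic tiled image pyramid. It assumes 256-pixel tiles and levels that halve in size until the whole image fits in one tile; that is an assumption about how such zoom viewers typically work, not a description of Zoomify’s exact output.

```python
import math


def tile_pyramid(width: int, height: int, tile: int = 256) -> list:
    """Return (level_width, level_height, tile_count) per level, smallest first."""
    levels = []
    w, h = width, height
    while True:
        cols = math.ceil(w / tile)
        rows = math.ceil(h / tile)
        levels.append((w, h, cols * rows))
        if cols == 1 and rows == 1:
            break  # the whole image now fits in a single tile
        # Halve each dimension for the next, coarser level.
        w = max(1, math.ceil(w / 2))
        h = max(1, math.ceil(h / 2))
    return list(reversed(levels))


levels = tile_pyramid(1024, 768)
print(levels)                      # [(256, 192, 1), (512, 384, 4), (1024, 768, 12)]
print(sum(n for _, _, n in levels))  # 17 tiles in total
```

A scan many thousands of pixels wide produces far more tiles than this small example, which is one reason to keep the zoom folders out of the top level of the share.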

Both instructors have Portfolio as well as a SOM machine for student assistants to catalog. I’ve customized the search function to allow for searches that include custom fields requested by the faculty.

the website is at

Future enhancements/requests:

Users curate exhibits, creating gallery pages with thumbnails that link to the zoomed image file as well as annotations. These exhibits would be vetted by the faculty before posting.

Users can add images, though it was less clear whether this would be done through the web interface. One needs large TIFF files to create the zoom images. Faculty can provide funds to have these outside resources scanned and digitized in a format suitable for web delivery, archival and research purposes.

Possible annotation of the image using a tool that lets users draw a box on the image and writes the information back to the JPEG header.

R and G will create some exhibits based on their book chapters. This will provide a best-practices scenario for future curators.

The future of the project

This resource will continue to be used in coursework. Students may curate online exhibits as well. Donald Brows and Schuller will also use this resource.

My initial thoughts regarding this project: it has a high profile, and the faculty have money to spend on development. Since the project will be used for coursework, it might be worth keeping it within ITG to gain experience in creating interactive galleries that let students build online presentations of media used in the course. However, because this will be a resource for those outside of Yale and possibly for peer institutions such as Harvard, a group such as the CMI or Web Team, which can provide more professional support in graphics and development, should be considered.

One could see creating a comment field next to a thumbnail and having NetPublish deliver the links to the zoomed image. NetPublish has the ability to create a shopping cart. A user shops, takes the items in the cart, and creates a detail page that has a link to the zoom file. The detail page would have a comment field. Currently shopping carts work on a cookie, but it is possible to code them to work on a session number, which would require a database outside of Portfolio to keep track of user and session.
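As a thought experiment, the session-based cart could be as small as one table keyed by session and item. The Python sketch below uses an in-memory SQLite database; the table layout, function names and item IDs are all hypothetical and have nothing to do with how NetPublish or Portfolio store data.

```python
import sqlite3
import uuid


def open_store() -> sqlite3.Connection:
    """Create the external database that tracks carts per session."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE cart ("
        " session_id TEXT, item_id TEXT, comment TEXT,"
        " PRIMARY KEY (session_id, item_id))"
    )
    return db


def new_session() -> str:
    """Issue a session number to use in place of a cookie-held cart."""
    return uuid.uuid4().hex


def add_item(db, session_id: str, item_id: str, comment: str = "") -> None:
    """Add an image to the cart, or update its comment if already present."""
    db.execute(
        "INSERT OR REPLACE INTO cart VALUES (?, ?, ?)",
        (session_id, item_id, comment),
    )


def cart_items(db, session_id: str) -> list:
    """Return (item_id, comment) pairs for one session's cart."""
    rows = db.execute(
        "SELECT item_id, comment FROM cart WHERE session_id = ? ORDER BY item_id",
        (session_id,),
    )
    return rows.fetchall()


db = open_store()
session = new_session()
add_item(db, session, "bond-0001", "Compare with the lottery ticket")
add_item(db, session, "bond-0002")
print(cart_items(db, session))
```

Each row carries the comment that would appear on the detail page, so a curated exhibit is just the set of rows for one session.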

There would need to be a gallery page with smaller thumbnails. Link to that gallery page from some main link, possibly adding gallery link to the buttons at the top near search, then the galleries get called in.

How I ripped YouTube video clips for a class

First I used a web page that gave me the address of the flv file (YouTube downloader). Then I used a shareware downloader program called FlashGet: I’d enter that flv file path and download the file to my machine. Then I used Total converter to convert those flv files (in the case of the class clips) to mp4, and I burned a DVD for use in a set-top player. Total converter is an interesting, cheap ($49) piece of software that takes video formats and converts them to several very usable formats. There might be more efficient ways to rip a YouTube clip. Craig sent me this link for keepvid. For other streamed formats you can often see the asx file, which will have the reference to the wmv or whatever file the page is streaming; then you use the downloader to grab that media file.

Hmm, what are the copyright implications of such a venture?
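An asx file is a small text playlist whose entries point at the real media file. As a minimal sketch, assuming the reference appears as a `ref` element with an `href` attribute, the address can be pulled out in Python with a regular expression. The sample playlist and its URL are invented for illustration.

```python
import re

# Tag and attribute names in asx files are not case-sensitive,
# so match them case-insensitively.
REF_HREF = re.compile(
    r'<ref\s+[^>]*?href\s*=\s*["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def media_urls_from_asx(asx_text: str) -> list:
    """Return every media address referenced by the playlist text."""
    return REF_HREF.findall(asx_text)


sample = """
<ASX version="3.0">
  <Entry>
    <Ref href="mms://example.invalid/lecture/clip.wmv" />
  </Entry>
</ASX>
"""
print(media_urls_from_asx(sample))
```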

Last minute scanning instructions from Mary

I copy and paste the bulk of an email from Mary regarding the present state of last minute scanning. The art history project has been taken offline. Attached you will find a zipped folder containing the web information that Mary had posted for art history last minute scanning. Last Minute scanning web pages and docs for art history – zipped files

Also attached are the Photo instructions for Mac and PC.

Here’s the email:
just to reiterate what I said over the phone: for the fall, Last Minute Scanning (or whatever the Library chooses to call it) will be using the catalog on media4 named MS_photo (password for all levels = “photo”). The URL for the web-based interface is:

This is a password-protected site, since it is open outside of Yale. To enter, leave the Username field blank and type “xxxxx” in the Password field.

The Library will need to decide what additional metadata fields they wish to include and will need to customize the interface based on that choice of fields. At present, the only custom fields in the catalog are Requester ID, Job ID and Source. There is no particular reason for them to copy what I did in 06-07.

At present, the 31 “just for fun” images in this catalog are in the MS_photo folder on media4. The Library may prefer that Media Services put them directly on one of the Library’s servers. That is something for Karen to decide. I handed out very detailed instructions on how the folder structure within MS_photo (or whatever top-level directory is to receive the images) is supposed to work and also detailed instructions for folks in Media Services on how to upload image files and add them to the catalog so that the web-based interface works as expected. I’ve included the handouts as attachments to this e-mail so that you can see them.

Photo instructions Mac
Photo instructions Windows

As a rule (with exceptions), I did not add image requests from Last Minute Scanning to individual faculty catalogs at the time of request, but rather did two bulk exports per semester from Last Minute Scanning to faculty catalogs. The exceptions to this rule were: (1) Chris gets things added to his personal catalog at the time of request, and (2) anyone who requests a very large number of images gets them added at the time of request. It is up to the Library to set their own policy on this.


At first glance MiniCMS is a web-based editing interface that works like Contribute. It uses the editable template regions that one creates in a DW template-based site and allows editing of the flat files on the server. It uses the Innovastudio WYSIWYG editor. It’s about $66 USD (49 euros) per site, with discounts for volume licensing purchases. I think the name “content management system” is a misnomer, since it doesn’t save any files to a database. It could be a nice web-based alternative to Contribute. I’m not sure how it accesses the server: can you give a login for a particular site? Is it possible that we could “embed” this into the webspace on a project site, which would allow a registered user to edit his or her own pages?

Here’s the link to MiniCMS

More to come…

Strebeigh English Web pages

I am meeting with Fred Strebeigh, who creates web pages for the ENGL120, 454 and FES 400a courses. He’ll have 3 project sites on V2. I plan to set him up with Contribute so that he can edit these sites on his own, creating the sites from Contribute’s built-in templates.

Update: after meeting with Prof. Strebeigh for 3 hours, I determined that he was not a good candidate for Contribute for the following reasons.

1. He’s very comfortable using Word.
2. The underlying concepts of HTML editing, uploading, etc. are not part of his knowledge, but he’d like to understand them and create more attractive and robust web pages (we are still talking static pages). Contribute rests on the premise that you need not worry about what is under the covers: you cannot edit the CSS, nor is it easy to make a navigation scheme that stays the same across several pages without a lot of editing.
3. He’s most likely to update pages offline and then upload them to the server.
4. The web site must be closed to the outside world and not Google-able (?), because it contains student papers for viewing.

Contribute really only works well when online: the process is to connect to the server, download from the server, make changes, and upload to the server. The only way that will work over WebDAV on V2 is if the site is viewable by anyone (i.e. open to the world).

So the fallback was to let him stick with what he knows (Word), edit as he normally would, and then place those files on the webspace using WebDAV. However, Word creates terrible code. It works well on his machine and even worked for old classes (sometimes, depending on whether he got all the files and folders up to the server correctly, which he was doing with the multi-file upload applet), but it mainly worked only when viewed in IE.

So we moved everything to the webspace using WebDAV and determined that he would edit with Word. But wait: nothing was working right, not even some of the links that weren’t broken on V1. The reason? Word creates XML files that link to other files inside folders. Sometimes the link is corrupted, as when there is a \ in the file path rather than a /, and the webspace is a Unix machine. There is also a case-sensitivity issue: one of his links points to Perry2007.htm when the file name is perry2007.htm.
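Both link problems are mechanical enough to check with a script. The Python sketch below, with a made-up function name and file list, converts backslashes to forward slashes and then matches the link against the files that actually exist on the server, ignoring case.

```python
def repair_link(href: str, actual_files: list) -> str:
    """Fix backslashes and letter case in a link, given the real file names.

    If no file matches even when case is ignored, the link is returned
    with only its slashes corrected.
    """
    normalized = href.replace("\\", "/")
    by_lower = {name.lower(): name for name in actual_files}
    return by_lower.get(normalized.lower(), normalized)


files_on_server = ["perry2007.htm", "papers/essay1.htm"]
print(repair_link("Perry2007.htm", files_on_server))       # perry2007.htm
print(repair_link("papers\\Essay1.htm", files_on_server))  # papers/essay1.htm
```

This only repairs the links as they stand today; anything Word regenerates on the next save would need to be checked again.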

I could clean this up so the links at least work but it’s likely to get overwritten next time he edits.

My suggestion to him was to allow us (ITG/student help) to take a month or two to clean up these sites, add left-hand navigation, strip the awful Word code, create a very simple CSS file, and take an hour to train him to use DW (only an hour).

I feel like he doesn’t really need a course website, but this is the way he’s done it for years: the URL is published in English Department booklets, and he gives out the URL to students for the summer, etc. So maybe a web site makes sense. There is no need to even enter the V2 arena, although the same could be done using the V2 structure; these are, after all, just text files. The questions remain: who has access, and how do they sign up for a project site? V2 remains fairly locked down to those who are not registered, and there is no easy way to give multiple users access to a project site without first determining who needs access and then adding them manually.

Gloria has made us project sites for these so the URLs will remain persistent.

Art History Personal Web pages

Currently working with Rob Nelson, Chris Wood and Jacqueline Jung.

Requested V2 project sites (using faculty last names), on which I will have “maintain” access. I’ll set up a webspace for each and connect to it with Contribute.

To Do items
Create a Blank Template
Move Wood current site to V2 and create connection
Check with Nelson to see if he has made updates to the site we started on VPN

Items to check on Contribute
Can library items be edited? (This might make the left navigation editable without using includes, which do not work on V2.)

Folder synch

This is useful if you have a series of nested folders where you have image files already categorized. The folder name and the filenames will be imported into the keyword field.

1. If there is no other copy of the nested folder structure and filenames, create a copy of the assets.
2. Create a local catalog. Make sure that under Advanced Cataloging options — Properties — “Create Keywords from Path” is selected and that Catalog Administration — “when adding Keywords….” is unchecked.
3. Click on the folder icon on the lower left
4. Navigate to the top level folder
5. Select the folder and click the synchronize icon. Click Sync in the next dialog box.
6. Files will now be added to the catalog.
7. Batch rename the files (this renames the original files). Use the naming convention plus a 4- to 5-digit number with leading zeros, e.g. Patterson-00001.
8. Export the field data (filename and keyword only): File > export field data.
9. Open the field data txt file in Excel and massage the entries (remove the extension from keywords).
10. Move the original files to the server by selecting all items, choosing Item > original > copy, and selecting the server and shared folder.
11. Putting the images in a folder not at the top level is a good idea. It keeps the share tidy. * need to do that to the Goetzmann share
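The naming and keyword clean-up steps above can be sketched in Python for anyone who would rather script them than massage the export by hand. Everything here is illustrative: the function names, the sample path and the list of extensions are assumptions, and the code does not talk to Portfolio at all.

```python
from pathlib import PurePosixPath

# Extensions treated as file endings rather than part of a keyword.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".tif", ".tiff", ".png", ".gif"}


def keywords_from_path(relative_path: str) -> list:
    """Folder names plus the file name, like keywords created from a path."""
    return list(PurePosixPath(relative_path).parts)


def strip_extension(keyword: str) -> str:
    """Remove a trailing image extension from a keyword, if it has one."""
    stem, dot, ext = keyword.rpartition(".")
    if dot and ("." + ext.lower()) in IMAGE_EXTENSIONS:
        return stem
    return keyword


def sequential_name(prefix: str, number: int, digits: int = 5) -> str:
    """Build a name such as Patterson-00001 with leading zeros."""
    return f"{prefix}-{number:0{digits}d}"


raw = keywords_from_path("Italy/Florence/duomo.jpg")
print([strip_extension(k) for k in raw])  # ['Italy', 'Florence', 'duomo']
print(sequential_name("Patterson", 1))    # Patterson-00001
```

Checking against a fixed list of extensions, rather than cutting at the last dot, keeps keywords such as “St. Peter” intact.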