Wednesday, 12 December 2012

Week 10 - Video and Audio Practice

Week 10 Video Submission:



Using the original video files and music files:


  • Cut the selected video files to size to meet the one-minute limit, arranging them so that the video flowed.



  • Applied a crop effect and some colourisation for a more cinematic look.



  • Applied a fade transition to the start and end of the video, and added fade transitions to selected clips.



  • For the audio, I added a reverb effect with a fade transition at the beginning and end (a rough scripted sketch of these steps follows below).
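This was all done in the editor rather than in code, but the same workflow can be sketched in Python with the moviepy library; the file names, cut points and fade lengths below are placeholder assumptions, not the actual project files:

```python
# A hedged sketch of the Week 10 editing steps using moviepy;
# "shot1.mp4" etc. and the clip times are placeholders.
from moviepy.editor import VideoFileClip, concatenate_videoclips

# Cut the selected clips to size so the whole piece fits the 1-minute limit.
clips = [VideoFileClip(name).subclip(0, 20)
         for name in ("shot1.mp4", "shot2.mp4", "shot3.mp4")]

# Fade in at the start and out at the end, then join the clips
# in an order that gives the video some flow.
clips[0] = clips[0].fadein(1)
clips[-1] = clips[-1].fadeout(1)
concatenate_videoclips(clips).write_videofile("week10.mp4")
```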

Week 9 - Image Processing

An image capture system contains a lens and a detector. In digital photography the detector is often a charge coupled device (CCD), a linear or matrix array of photosensitive electronic elements.

Pixelization can be seen with the unaided eye if the resolution of the sensor array is too low. Increasing the number of cells in the sensor array increases the resolution of the captured image.

Before the light collected by the lens is focused onto the sensor array, it is passed through an optical low-pass filter that serves to:


  • Exclude any picture detail beyond the sensor's resolution.

  • Compensate for false colouration.

  • Reduce infrared and other non-visible light.

A pixel is the smallest digital element manipulated by image processing software.

Each pixel is individually coloured, but since pixels are of finite size they only approximate the actual colouring of a subject. Thus bitmaps often show blocky areas or jagged lines under close examination.



Four common categorisations of DIP operations are analysis, manipulation, enhancement and transformation.


  • Analysis operations provide information on photometric features of an image e.g. colour count, histogram



  • Manipulation operations change the content of an image e.g. flood fill, crop



  • Enhancement operations attempt to improve the quality of an image in some sense e.g. heighten contrast, edge enhancement 



  • Transformation operations alter the image geometry e.g. rotate, skew (each category is sketched in code below)
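A minimal sketch of the four categories in Python with the Pillow library ("photo.jpg" is a placeholder file name, and the crop box, contrast factor and angle are arbitrary examples):

```python
from PIL import Image, ImageEnhance

img = Image.open("photo.jpg")

# Analysis: report photometric features, e.g. a colour histogram.
hist = img.histogram()                      # 256 counts per channel

# Manipulation: change the content, e.g. crop a region.
cropped = img.crop((0, 0, 100, 100))        # left, upper, right, lower

# Enhancement: improve quality in some sense, e.g. heighten contrast.
contrasty = ImageEnhance.Contrast(img).enhance(1.5)

# Transformation: alter the geometry, e.g. rotate.
rotated = img.rotate(45, expand=True)
```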

Week 7 - Looking at Light

The light generated, transmitted and reflected by objects enables us to see, record and interpret the world around us as images, capturing: colours, shapes, details and textures.

The position of the sun throughout the day is the factor with the greatest impact on natural lighting, and has the power to transform the appearance of any subject. 

The direction from which light strikes the subject is important. Subjects can be illuminated by front lighting, side lighting or back lighting. 

Front lighting occurs when the source is behind the observer, shining directly onto the subject. Light covers the subject evenly, revealing its detail. 

Side lighting creates strong shadows, which emphasize texture in an image, giving a greater sense of shape, dimension and depth. 

The most dramatic arrangements of light and shadow appear with back-lighting. In this case, the source of light is behind the subject, creating silhouettes and other interesting effects. 


Friday, 26 October 2012

Week 6 - Audio Signal Processing

Applying the compression effect to the audio piece reduced its dynamic range, producing more consistent volume levels and increasing perceived loudness.

Compression is particularly effective for voice-overs, because it helps the speaker stand out over musical soundtracks and background audio.


Some info I found that helps remember what each setting does:


Standard settings

Amount  
Controls the level of compression.

Advanced settings

Threshold  
Sets the input level at which compression begins. The best setting depends on audio content and style. To compress only extreme peaks and retain more dynamic range, try thresholds around 5 dB below the peak input level. To highly compress audio and greatly reduce dynamic range, try settings around 15 dB below the peak input level.
Ratio  
Sets a compression ratio between 1‑to‑1 and 30‑to‑1. For example, a setting of 3 outputs 1 dB for every 3-dB increase above the threshold. Typical settings range from 2 to 5; higher settings produce the extremely compressed sound often heard in pop music.
Attack  
Determines how quickly compression starts after audio exceeds the Threshold setting. The default, 10 milliseconds, works well for a wide range of source material. Use faster settings only for audio with quick transients, such as percussion recordings.
Release  
Determines how quickly compression stops when audio drops below the Threshold setting. The default, 100 milliseconds, works well for a wide range of audio. Try faster settings for audio with fast transients, and slower settings for less percussive audio.
Output Gain  
Boosts or cuts amplitude after compression. Possible values range from ‑30 dB to +30 dB, where 0 is unity gain.
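Attack and release aside, the threshold and ratio behaviour above reduces to a simple static "gain computer". This is only a sketch of that behaviour, not the effect's actual implementation, and the threshold and ratio defaults are arbitrary examples:

```python
def compressor_gain_db(level_db, threshold_db=-20.0, ratio=3.0):
    """Gain in dB that a compressor applies at a given input level.

    Levels at or below the threshold pass unchanged; above it, every
    `ratio` dB of input yields only 1 dB of output above the threshold.
    """
    over = level_db - threshold_db
    if over <= 0:
        return 0.0               # below threshold: unity gain
    return over / ratio - over   # negative value = attenuation

# A peak 9 dB above a -20 dB threshold with ratio 3 is cut by 6 dB,
# leaving it just 3 dB above the threshold.
print(compressor_gain_db(-11.0))  # -> -6.0
```

In a real compressor the Attack and Release settings smooth this gain change over time rather than applying it instantly.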


Ambient Sound and Reverb:

The 'reverb' you hear on records is sometimes the result of ambient miking techniques, rather than artificial reverb. Opening the doors of a studio and placing ambient mics in hallways or adjacent spaces, as described by producer Ben Hillier, seems to be a popular technique.


Friday, 19 October 2012

The Ear and Hearing


The Outer Ear
The external ear (pinna) and the ear canal act as a guide, directing sound waves that vibrate the ear drum. The acoustic transfer functions formed by the shape, size and position of the head and pinna also provide information on the relative direction of the sound source. The roughly horn-like shape of the pinna and ear canal creates an acoustic transfer function that is directionally sensitive and also amplifies sound at certain frequencies, e.g. a 10 dB to 20 dB gain around 2.5 kHz.

The Middle Ear
The vibration of the eardrum is transferred through the chain of small bones (ossicles) in the middle ear that in turn vibrate the oval window of the cochlea. This impedance-matching chain of transmission acts to reduce power loss as the air-borne vibrations are transferred to the fluid medium in the cochlea. Excessive movement of the ossicles is constrained by a neuro-muscular feedback mechanism that acts to prevent damage due to loud sounds.

The Inner Ear
Vibration of the oval window produces pressure waves in the cochlear fluid that stimulate cochlear structures that perform a spectral analysis. Sensory "hair" cells within the cochlea cause neurons connected to the auditory nerve to "fire". They transmit timing, amplitude and frequency information to the auditory brain stem, where a hierarchy of neural processing commences.

Human Hearing and Speech
Human hearing covers a range of frequencies from about 20 Hz to 20 kHz and can respond to an enormous range of sound levels.
Within this range are those frequencies and levels generated by conversational speech. A rough guide to the separation in frequency and level between vowel sounds and consonants is shown.
The lowest sound pressure level (SPL) that humans can hear varies with frequency and is called the Hearing Threshold.
Greatest sensitivity is normally in the frequency range 1 kHz to 4 kHz.

MPEG/MP3 Audio Coding
The use in MP3 of a lossy compression algorithm is designed to greatly reduce the amount of data required to represent the audio recording while still sounding like a faithful reproduction of the original uncompressed audio to most listeners. An MP3 file created at 128 kbit/s is about 11 times smaller than the corresponding uncompressed CD audio: CD audio runs at 44.1 kHz × 16 bits × 2 channels ≈ 1411 kbit/s, and 1411/128 ≈ 11. An MP3 file can also be encoded at higher or lower bit rates, with correspondingly higher or lower quality.
The compression works by reducing the accuracy of certain parts of the sound that are considered to be beyond the auditory resolution of most people. This method is commonly referred to as perceptual coding. It uses psychoacoustic models to discard or reduce the precision of components less audible to human hearing, and then records the remaining information efficiently.
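As a rough check on that 11:1 figure, here is a quick size comparison for a hypothetical three-minute track (the duration is an arbitrary example):

```python
# CD-quality PCM vs 128 kbit/s MP3 for a 3-minute track.
duration_s = 180
cd_kbps = 44100 * 16 * 2 / 1000          # 1411.2 kbit/s uncompressed
mp3_kbps = 128

cd_mb = cd_kbps * duration_s / 8 / 1000  # kbit -> kB -> MB
mp3_mb = mp3_kbps * duration_s / 8 / 1000
print(cd_mb, mp3_mb, cd_mb / mp3_mb)     # ~31.8 MB vs ~2.9 MB, ratio ~11
```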






Friday, 12 October 2012

Digital Signal Processing - Lab 2 Video

A Typical Digital Signal Processing System




Signal processing is an area of systems engineering, electrical engineering and applied mathematics that deals with operations on, or analysis of, signals in either discrete or continuous time. Signals of interest include sound and images; signals are analog or digital electrical representations of time-varying or space-varying physical quantities.

Electronic filters are electronic circuits which perform signal processing functions, specifically to remove unwanted frequency components from the signal, to enhance wanted ones, or both. Electronic filters can be:

1) passive or active

A passive component, depending on field, may be either a component that consumes (but does not produce) energy, or a component that is incapable of power gain.

An active filter is a type of analog electronic filter, distinguished by the use of one or more active components i.e. voltage amplifiers or buffer amplifiers. Typically this will be a vacuum tube, or solid-state (transistor or operational amplifier).

2) analog or digital

3) high-pass, low-pass, band-pass, band-stop (band-reject or notch), or all-pass.

A high-pass filter (HPF) is a device that passes high frequencies and attenuates (i.e., reduces the amplitude of) frequencies lower than its cutoff frequency. A high-pass filter is usually modeled as a linear time-invariant system.
High-pass filters have many uses, such as blocking DC from circuitry sensitive to non-zero average voltages or RF devices. They can also be used in conjunction with a low-pass filter to make a bandpass filter. The actual amount of attenuation for each frequency is a design parameter of the filter.
A low-pass filter is an electronic filter that passes low-frequency signals but attenuates (reduces the amplitude of) signals with frequencies higher than the cutoff frequency. The actual amount of attenuation for each frequency varies from filter to filter. It is sometimes called a high-cut filter, or treble cut filter when used in audio applications. A low-pass filter is the opposite of a high-pass filter.
A band-pass filter is a combination of a low-pass and a high-pass.
Low-pass filters exist in many different forms, including electronic circuits (such as a hiss filter used in audio), anti-aliasing filters for conditioning signals prior to analog-to-digital conversion, digital filters for smoothing sets of data, acoustic barriers, blurring of images, and so on. Low-pass filters provide a smoother form of a signal, removing the short-term fluctuations, and leaving the longer-term trend.
In signal processing, a band-stop filter or band-rejection filter is a filter that passes most frequencies unaltered, but attenuates those in a specific range to very low levels. It is the opposite of a band-pass filter. A notch filter is a band-stop filter with a narrow stopband (high Q factor).

4) discrete-time (sampled) or continuous-time

5) linear or non-linear

6) infinite impulse response (IIR type) or finite impulse response (FIR type)
Infinite impulse response (IIR) is a property of signal processing systems. Systems with this property are known as IIR systems or, when dealing with filter systems, as IIR filters. IIR systems have an impulse response function that is non-zero over an infinite length of time. This is in contrast to finite impulse response (FIR) filters, which have fixed-duration impulse responses. The simplest analog IIR filter is an RC filter made up of a single resistor (R) feeding into a node shared with a single capacitor (C). This filter has an exponential impulse response characterized by an RC time constant.
IIR filters may be implemented as either analog or digital filters. In digital IIR filters, the output feedback is immediately apparent in the equations defining the output.
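As an illustration (not tied to any particular filter design), a first-order digital IIR low-pass filter makes that output feedback visible: each output sample depends on the previous output, so the impulse response decays forever, just like the RC filter's exponential:

```python
def iir_lowpass(x, alpha=0.1):
    """First-order IIR low-pass: y[n] = alpha*x[n] + (1 - alpha)*y[n-1]."""
    y, prev = [], 0.0
    for sample in x:
        prev = alpha * sample + (1 - alpha) * prev  # feedback on y[n-1]
        y.append(prev)
    return y

# A unit impulse produces an exponentially decaying response that
# never reaches exactly zero -- the "infinite" in IIR.
print(iir_lowpass([1, 0, 0, 0, 0]))  # [0.1, 0.09, 0.081, 0.0729, 0.06561]
```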

The most common types of electronic filters are linear filters, regardless of other aspects of their design.

Pitch is an auditory perceptual property that allows the ordering of sounds on a frequency-related scale. Pitches are compared as "higher" and "lower" in the sense associated with musical melodies, which require "sound whose frequency is clear and stable enough to be heard as not noise". Pitch is a major auditory attribute of musical tones, along with duration, loudness, and timbre.


DSP System Needs

Input and output filtering
Analogue-to-digital and digital-to-analogue conversion
Digital Processing Unit


Why Use Digital Processing?

1) Precision - In theory the precision of Digital Signal Processing systems is limited only by the conversion process at input and output.
In practice, sampling rate (sampling frequency) and word length restrictions (number of bits) modify this.
However, the increasing operating speed and word length of modern digital logic are opening up many more areas of application.

2) Robustness - Due to logic noise margins, digital systems are inherently less susceptible than analogue systems to: a) electrical noise and b) component tolerance variations.
Adjustments for electrical drift and component ageing are essentially removed; this is important for complex systems.

3) Flexibility - Programmability allows upgrading and expansion of the processing operations, without necessarily incurring large scale hardware changes. Practical systems with desired Time Varying and/or Adaptive characteristics can be constructed.

Simple Sound Card Architecture



Sampling a Signal





Friday, 5 October 2012

Week 2 - Getting familiar with waves


Q1 In a recording room an acoustic wave was measured to have a frequency of 1 kHz. What would its wavelength in cm be?
Answer: The sound will be travelling through air, so the velocity of sound will be about 340 m/s. To get the wavelength, divide the velocity (340 m/s) by the frequency of the wave (1000 Hz). This gives 0.34 m, i.e. 34 cm.

Q2 If an acoustic wave travelling along a work bench has a wavelength of 3.33 m, what will its frequency be? Why do you suppose it is easier for this type of wave to travel through solid materials?
Answer: The velocity of sound through the bench is about 4000 m/s. To get the frequency, divide the velocity (4000 m/s) by the wavelength (3.33 m). Its frequency is then about 1.2 kHz. Sound travels easily through solids because the particles are tightly bound together, so vibrations are passed on quickly and with little loss. (Both calculations are checked in the snippet below.)
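Both answers follow from v = f × λ; a quick Python check, using the velocities assumed above:

```python
def wavelength(v_mps, f_hz):
    return v_mps / f_hz      # lambda = v / f

def frequency(v_mps, lam_m):
    return v_mps / lam_m     # f = v / lambda

print(wavelength(340, 1000))   # Q1: 0.34 m = 34 cm in air
print(frequency(4000, 3.33))   # Q2: ~1201 Hz, i.e. roughly 1.2 kHz
```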

Q3 Research the topic "Standing Waves". Write a detailed note explaining the term and give an example of this that occurs in real life. (Where possible draw diagrams and describe what they represent.)
Answer: A standing wave is a wave pattern that oscillates in place but does not travel through the medium; it forms when two waves of the same frequency travel in opposite directions (for example a wave and its reflection) and interfere. A real-life example is a guitar string: when plucked, the string vibrates up and down between its fixed ends, but the wave does not travel along the guitar.

Q4 What is meant by the terms constructive and destructive interference?
Answer: Constructive interference occurs when the crest of one wave overlaps the crest of another, combining to produce a wave of increased amplitude, whereas destructive interference occurs when the crest of one wave overlaps the trough of another, so the waves cancel and the resulting amplitude is reduced (or zero).

Q5 What aspect of an acoustic wave determines its loudness?
Answer: Its amplitude: the higher the amplitude, the louder the sound.

Q6 Why are decibels used in the measurement of relative loudness of acoustics waves?
Answer: Decibels are used because the decibel is a logarithmic measure that spans the enormous range of sound intensities our ears can perceive, and it correlates closely with how our ears judge loudness.

Q7 Does sound travel under water? If so, what effect does the water have?
Answer: Sound does travel under water, and it travels faster than in air (roughly 1500 m/s versus 340 m/s) because water particles are closer together, so vibrations pass from particle to particle more quickly. 

Friday, 28 September 2012

Sound Waves - Lab Video 1

Sound Waves

Sound is transmitted as wave motion through a medium such as air, water or metal.

Waves are divided into types according to the direction of displacement of the medium in relation to the direction of the motion of the wave itself.

Two basic classifications are transverse and longitudinal waves.

Transverse Waves

An example of a transverse wave is the ripples on the surface of water. The vibrations of the water molecules are at right angles (up and down) to the direction of motion (out from the disturbance).

Longitudinal Waves

If the vibration is parallel to the direction of motion, as it is with sound, the wave is known as a longitudinal wave. As the sound wave is propagated outward from the centre of disturbance, the individual air molecules move back and forth, parallel to the direction of the wave motion.
Each individual molecule passes the energy on to neighbouring molecules, but after the sound wave has passed, each molecule remains in about the same location in space.



Compression and Rarefaction

Thus, a sound wave is a series of alternating compression (increase in density) and rarefaction (decrease in density) events in the medium, e.g. air.

Wavelength and Amplitude

For a transverse wave, the wavelength is the distance between two successive crests or troughs. For longitudinal waves, it is the shortest distance between two peak compressions.



Velocity

Sound in steel moves at a speed of just under 5000 m/s (faster than a bullet, for example).
The speed of sound in water is roughly 1500 m/s (faster than a fighter jet, for example) at ordinary temperatures, and it increases with an increase in temperature.
The speed of sound in the air around us is roughly 1/3 km/s, i.e. about 333 m/s.
Thus, to travel the 3 m length of an average living room, sound takes t = 3/333 s ≈ 0.009 s.

Frequency, Velocity and Wavelength

The frequency of the wave is the number of vibrations per second. The unit is Hertz (Hz).
The velocity of the wave, which is the speed at which it advances, is equal to the wavelength times the frequency.
Velocity = wavelength x frequency

Standing Waves

Standing waves disturb, but do not travel through, the transmission medium. They occur, for example, in the vibrating strings of musical instruments.
A violin string when bowed or plucked, vibrates as a whole with a node (minimum) at each end and an anti-node (maximum) in the middle.
It also vibrates in halves, with a node at the centre, in thirds and in various other fractions, all simultaneously.
Such vibrational modes also occur within cavities e.g. a room or the bore of a flute. A room can support a standing wave with a node at each pair of opposing walls.



Harmonics

The vibration as a whole produces the fundamental tone, and the other vibrations produce the various overtones.
A harmonic is an integer multiple of the fundamental frequency e.g. 2x fundamental, 3x fundamental, etc.
So a sound consisting of components at frequencies of 1000 Hz and 3000 Hz would contain a 3rd harmonic of the 1 kHz fundamental.
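For example, that 1 kHz + 3 kHz sound can be synthesised in a couple of lines (the half amplitude given to the harmonic is an arbitrary choice):

```python
import numpy as np

fs = 44100                              # sample rate in Hz
t = np.arange(fs) / fs                  # one second of time values
# 1 kHz fundamental plus its 3rd harmonic at 3 kHz, at half amplitude.
tone = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)
```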

Amplitude

The amplitude of a sound wave is the degree of motion of air molecules within the wave, which corresponds to the extent of rarefaction and compression that accompanies the wave.
The greater the amplitude of the wave, the harder the molecules strike the ear drum or microphone diaphragm and the louder the sound that is transduced.
The amplitude of a sound wave can be expressed in absolute units by measuring the actual distance moved by the air molecules, or the pressure difference in the compression and rarefaction, or the energy involved.
Ordinary speech, for example, produces sound energy at a power level of about one hundred-thousandth of a watt (a very small amount of power).

Sound Intensity & Level

These measurements are extremely difficult to make, so the intensity of sounds is generally expressed as an equivalent sound level.

Normal atmospheric pressure is 100,000 Pa.

Sound Level and the Decibel

The sound intensity of normal conversational speech is around 100,000 times that of whispered speech.
So that we can conveniently discuss and graph such a huge range of values, sound intensity level is defined using a logarithm and measured in decibels (dB); a ratio of 100,000, i.e. 10^5, corresponds to a level difference of 50 dB.

The Inverse-Square Law

The intensity of the sound received varies inversely as the square of the distance R from the source.
In open air, sound will be roughly nine times less intense at a distance of 3m from its origin, as at a distance of 1m.
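The nine-times figure is just (3/1)² = 9, and in decibel terms that is a drop of 10·log10(9) ≈ 9.5 dB. A small check:

```python
import math

def intensity_ratio(r1, r2):
    return (r2 / r1) ** 2            # times weaker at r2 than at r1

def level_drop_db(r1, r2):
    return 10 * math.log10(intensity_ratio(r1, r2))

print(intensity_ratio(1, 3))         # 9.0 -> nine times less intense
print(level_drop_db(1, 3))           # ~9.54 dB quieter at 3 m than 1 m
```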

Echoes and Reverberation

An echo is the perceived reflection of sound from a surface. The fraction of sound level reflected is known as the reflection coefficient. The time difference between the echo and the direct sound depends on the distances travelled and the speed of sound. The difference must be greater than about 100 ms to be perceived as an echo; at roughly 340 m/s that corresponds to an extra path length of about 34 m, e.g. a reflecting wall at least about 17 m away.

Spectrum

Since many sounds contain various frequency components, it is often useful to display a sound spectrum, that is, a graph of sound level against frequency over a short period of time.

Spectrogram

The variation of intensity with time and frequency can be displayed as a spectrogram by representing intensity by colour or brightness on a frequency vs time axis.
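One way to produce such a display in Python is matplotlib's built-in specgram; the test tone here is the 1 kHz + 3 kHz example from the Harmonics section, and any mono signal array would do in its place:

```python
import numpy as np
import matplotlib.pyplot as plt

fs = 44100
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

# Intensity shown as colour on a frequency-vs-time axis.
plt.specgram(tone, NFFT=1024, Fs=fs)
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.show()
```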



Useful links:

http://www.mediacollege.com/audio/01/sound-waves.html
http://www.mediacollege.com/audio/01/wave-properties.html
http://www.mediacollege.com/audio/01/wave-interaction.html
http://www.mediacollege.com/audio/01/sound-systems.html
http://science.howstuffworks.com/sound-info.htm