The majority of signals that you will model as random have multiple values, that is, they are multivariate. Examples include multiple samples of a time signal and signals measured at different sensors. In this lesson you will learn the tools used to jointly characterize the random behavior of multiple signal components. You will learn how to interpret the covariance matrix, which is the most widely used characterization for multivariate random signals. This foundation will position you to understand and properly use many signal processing techniques.


Peter A says

Hello Prof. Barry,

You mention that x could be a vector of N samples in time or samples from N sensors. In the case of N sensors, is it assumed that you take a slice in time (or frequency, if in the frequency domain), or do x1, x2, ... represent arrays themselves?

In other words, what does the notation x(jw) in many books mean in this context? (x is a vector and w = omega)

Thank you,

Peter

Barry Van Veen says

There are a lot of possibilities for defining x, including the cases you mention: 1) N samples in time, 2) N different sensor outputs at a particular time (or frequency), 3) samples in time from multiple sensors. This last case may be defined as you suggest, by letting x1, x2, etc be arrays.

The covariance matrix is defined the same way for any of these cases. However, the structure in the covariance matrix will change depending on the relationship between the entries in x. If x consists of time samples from a wide-sense stationary random process, then the covariance matrix will be Toeplitz, or constant along diagonals.
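A quick numerical sketch of that Toeplitz structure (the exponential autocorrelation r[k] = rho^|k| used here is just an illustrative choice, not something from the discussion above):

```python
import numpy as np
from scipy.linalg import toeplitz

# Illustrative WSS example: autocorrelation r[k] = rho**|k|.
# For N time samples of a WSS process, the covariance matrix is
# Toeplitz: entry (i, j) depends only on the lag |i - j|.
N, rho = 5, 0.8
r = rho ** np.arange(N)     # autocorrelation at lags 0..N-1
C = toeplitz(r)             # N x N covariance matrix

# Verify it is constant along each diagonal.
for k in range(N):
    d = np.diag(C, k)
    assert np.allclose(d, d[0])
```

If x instead held outputs of unrelated sensors, the matrix would still be a valid covariance matrix but would generally have no Toeplitz structure.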

Typically x(jw) refers to the Fourier transform of a (possibly vector) signal x(t). For example, if x(t) is a vector of time series collected at multiple sensors, then x(jw) contains the Fourier transform of each entry of x(t). This notation is typically limited to deterministic (non-random) signals. It is not proper to use x(jw) to mean the Fourier transform when x(t) is a random signal, as the Fourier integral does not converge for a random signal.
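As a concrete illustration of this notation (the array layout and signals below are assumptions for the example, not part of the discussion): if the M sensor time series are stored as the rows of an M x N array, then the discrete analogue of x(jw) is simply the DFT of each row.

```python
import numpy as np

# Hypothetical setup: M sensors, N time samples, stored as an M x N array.
M, N = 3, 64
t = np.arange(N)
# Each sensor observes a deterministic sinusoid at a different frequency.
x = np.vstack([np.cos(2 * np.pi * (m + 1) * t / N) for m in range(M)])

# The vector "x(jw)": one Fourier transform per sensor (per row).
X = np.fft.fft(x, axis=1)
print(X.shape)   # (3, 64): M transformed channels, N frequency samples
```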

Peter A says

Thank you! I think I understand the nomenclature now.

Regarding the use of x(jw) for random signals, what about noise? What if an algorithm calls for using the PSD matrix of the noise? Is it proper to estimate it using pwelch, or is multiplying the vectors E[x(jw)x^H(jw)] preferable?

Thank you.

Barry Van Veen says

Using the expectation E{x(jw)x^H(jw)} requires the probability density of the noise, since it is defined as the integral of x(jw)x^H(jw) with respect to the probability density. If you know the probability density function, then you should use it, as that gives an exact value. However, it is very unusual to have the probability density function; usually one only has data. In that case one needs to approximate the expectation using some sort of average. pwelch performs such an average by breaking the data up into segments and then averaging over the segments. If you have noise observations, then pwelch is one way to estimate the PSD matrix.
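A minimal sketch of this segment-averaging idea in Python, using scipy.signal.welch and csd as the closest analogues of MATLAB's pwelch (the two-channel white-noise data here is made up for illustration):

```python
import numpy as np
from scipy.signal import welch, csd

# Two channels of unit-variance white noise standing in for noise observations.
rng = np.random.default_rng(1)
n = rng.standard_normal((2, 4096))

# Welch's method: split into segments, average periodograms over segments.
f, S11 = welch(n[0], nperseg=256)            # auto-spectrum, channel 1
f, S22 = welch(n[1], nperseg=256)            # auto-spectrum, channel 2
f, S12 = csd(n[0], n[1], nperseg=256)        # cross-spectrum

# Stack into a 2 x 2 PSD matrix at each frequency bin.
S = np.array([[S11, S12],
              [np.conj(S12), S22]])          # shape (2, 2, len(f))
```

For unit-variance white noise the one-sided density estimates S11 and S22 hover around 2, while the cross-spectrum S12 is near zero since the channels are independent.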

If you have signal plus noise (you don't have observations of the noise by itself), then you have to account for the presence of the signal when estimating the PSD of the noise. This is a bit more complex. One approach that may work is maximum likelihood (ML) estimation. I have several examples at https://allsignalprocessing.com/maximum-likelihood-estimation-examples/. ML works well if the PSD has known structure, e.g., the PSD corresponds to uncorrelated, equal-variance data so that E{x(jw)x^H(jw)} = σ²I, where σ² is unknown. In this case estimating σ² usually involves averaging over samples of the elements of x(jw), where x(jw) is evaluated from the data using the DFT.
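That last step can be sketched as follows, under the stated assumption that the noise is white with unknown variance σ² (so the PSD matrix is σ²I); the sensor count, sample size, and true variance are made up for the example:

```python
import numpy as np

# Assumed setup: M sensors, N time samples of white noise with
# unknown variance sigma2; here we simulate it with sigma2 = 4.
rng = np.random.default_rng(2)
sigma2 = 4.0
M, N = 3, 1024
x = np.sqrt(sigma2) * rng.standard_normal((M, N))

# "x(jw) evaluated from the data using the DFT":
X = np.fft.fft(x, axis=1)

# With this DFT convention, E|X[k]|^2 = N * sigma2 for white noise,
# so average |X[k]|^2 over bins and sensors, then divide by N.
sigma2_hat = np.mean(np.abs(X) ** 2) / N
```

By Parseval's relation this average equals the time-domain mean square of the data, so the estimate is close to the true σ² for any reasonably long record.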