Wednesday, September 20, 2017

What you know vs. what you can prove

I had an interesting evening. A friend sent a link to a YouTube video, a recorded webinar for a "video analysis" product. I'll admit, I was curious, so I watched it. Below is my commentary on what I saw.

The presenter outlined his workflow for working with video from a few different sources. The presentation turned to the difference between how DirectShow handles video playback vs. what's actually in the data stream. The presenter showed how DirectShow may not give you the best version of the data to work with. If you've been around LEVA for a while, you likely know this already. It's good information to know.

Then the presenter compared a corrected frame from the data stream with a frame from the video as played via the DirectShow codec - in Photoshop. He made an interesting statement that prompted me to write this post. He was clicking between the layers so that viewers could "see" that there was a difference between the two frames. The implication was that the viewer could clearly "see" the difference. He was making the point that one frame had more information / a clearer display of the information - illustrating it visually (demonstratively).

This got me thinking - does he know how people "see?"


I'll get to the difference between his workflow (using many tools) and FIVE (using one tool) in a bit. But first, I want to address his point about "clearly seeing."

I do not hide the fact that I am autistic. Thanks to the changes in the DSM, my original brain wiring diagnoses that were made during the reign of the DSM-IV now place me firmly on the autism spectrum in the DSM-5. Sensory Processing Disorder and Prosopagnosia have made life rather interesting, especially growing up in a time when doctors didn't understand these wiring issues at all. As an analyst, they present challenges. But they also present opportunities. Can a person doing manual facial comparison be accused of bias if they're face blind? Not sure. Never been asked. But I digress.

I've spent my academic life studying the sensory environment. My PhD dissertation focuses on college sensory environments so hostile that autistic college students would rather drop out than stick it out. But again, I've studied and written extensively on the issue of what people perceive, so the presenter's statement struck me.

It also struck me from the standpoint of what we "know" vs. what we can prove.

The presenter took viewers on quite a tour of a hex editor, GraphStudio, his preferred "workflow engine," and Photoshop before making the statement that prompted this post. A lot of moving parts. Along the way, the story of how information is displayed and why it's important to "know" where differences can occur was driven home.

Yes, we can all agree that there are differences between how DirectShow shows things and how a raw look at the video shows things. It may be helpful to "see" this. But what if you don't perceive the world in the same way as the storyteller?

Might there be another way to perform this experiment that doesn't rely on the viewer's perception matching that of the presenter?

Thankfully, with FIVE, there is.

The presenter started with Walmart (SN40) video being displayed via DirectShow. So, I'll start there too. SN40, via DirectShow, displays as 4 CIF.


Then, I used FIVE's conversion engine to copy the stream data into a fresh container.
It displays as 2 CIF.


I selected the same frame in each video and bookmarked them for export.



I brought these images back into FIVE for analysis.

The issue with 2 CIF is that, in general, every other line of resolution isn't actually recorded and needs to be restored via some valid and reliable process. FIVE's Line Doubling filter allows me to restore these lines. I can choose the interpolation method during this process. The presenter in the video chose a linear interpolation method to restore the lines (in Photoshop - not his "workflow engine"), so I did the same.
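FIVE's Line Doubling filter is proprietary, so the sketch below is only an illustration of the general idea: restoring the missing alternate lines by linear interpolation, i.e. averaging each pair of recorded neighbors. The function name and demo values are my own.

```python
import numpy as np

def line_double_linear(field: np.ndarray) -> np.ndarray:
    """Double the vertical resolution of a frame whose alternate lines
    were never recorded. Each missing line is filled with the linear
    interpolation (average) of the recorded lines above and below it."""
    f = field.astype(np.float64)
    out = np.zeros((2 * f.shape[0],) + f.shape[1:], dtype=np.float64)
    out[0::2] = f                               # recorded lines stay as-is
    out[1:-1:2] = (f[:-1] + f[1:]) / 2.0        # average of the two neighbors
    out[-1] = f[-1]                             # last line: replicate (no neighbor below)
    return np.clip(out, 0, 255).astype(field.dtype)

# Tiny demo: a two-line "field" becomes four lines.
field = np.array([[0, 0, 0], [100, 100, 100]], dtype=np.uint8)
print(line_double_linear(field))
```

The interpolated line lands halfway between its neighbors (here, 50), which is exactly what a linear method does - and why sharper interpolation choices exist for high-contrast content.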


I've now restored the stream copy frame. I wanted not only to "see" the difference between frames ("what I know"), I wanted to compute the difference between frames ("what I can prove").

Again, staying in FIVE, I linked the DirectShow frame with the Stream Copy frame using the Video Mixer (found in the Link filter group).


The filter settings for the Video Mixer contain three tabs. The first tab (Inputs) allows the user to choose which processing chains to link, and at what step in the chain.


The second tab (Blend) allows the user to choose what is done with these inputs. In our case, I want to Overlay the two images.


The third tab (Similarity) is where we transition from the demonstrative to the quantitative. Unlike Photoshop's Difference blend mode, FIVE doesn't just display a difference (is there a threshold where a difference is present but not displayed by your monitor?); it computes similarity metrics.
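That parenthetical question is easy to demonstrate numerically. The toy NumPy sketch below (my own illustration, not FIVE's or Photoshop's actual code) builds two "frames" that differ in half their pixels by a single grey level - a real, countable difference that a Difference blend would render at 1 out of 255, effectively indistinguishable from black on most displays.

```python
import numpy as np

# Two 4x4 "frames" that differ by one grey level in half their pixels.
a = np.full((4, 4), 128, dtype=np.uint8)
b = a.copy()
b[:, 2:] += 1   # a 1/255 difference -- below what most monitors/eyes resolve

diff = np.abs(a.astype(int) - b.astype(int))
print(diff.max())              # the difference is real...
print(np.count_nonzero(diff))  # ...and present in 8 of 16 pixels,
# yet a Difference blend shows these pixels at value 1 of 255:
# numerically present, visually invisible.
```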


With the Similarity Metrics enabled, FIVE computes the Sum of Absolute Differences (SAD), the Peak Signal-to-Noise Ratio (PSNR), the Mean Structural Similarity (SSIM), and the Correlation Coefficient. The actual difference, computed four different ways. You don't just "see" or "know" - you prove.
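FIVE's exact implementations aren't published, but all four metrics have standard textbook definitions. Here's a NumPy sketch of those definitions (my own illustration; note the SSIM term uses a simplified single-window form rather than the standard locally-windowed average, and assumes 8-bit frames):

```python
import numpy as np

def similarity_metrics(a: np.ndarray, b: np.ndarray) -> dict:
    """Textbook forms of SAD, PSNR, (global) SSIM, and Pearson correlation
    for two same-sized 8-bit frames."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    diff = a - b
    sad = np.abs(diff).sum()                       # Sum of Absolute Differences
    mse = np.mean(diff ** 2)
    psnr = float("inf") if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)
    # Global SSIM: one window over the whole frame (a simplification of the
    # standard formulation, which averages SSIM over small local windows).
    c1, c2 = (0.01 * 255) ** 2, (0.03 * 255) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    ssim = ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a**2 + mu_b**2 + c1) * (a.var() + b.var() + c2))
    corr = np.corrcoef(a.ravel(), b.ravel())[0, 1]  # Pearson correlation
    return {"SAD": sad, "PSNR": psnr, "SSIM": ssim, "Corr": corr}

# Sanity check: identical frames score perfectly on all four.
frame = np.arange(64, dtype=np.uint8).reshape(8, 8)
print(similarity_metrics(frame, frame))  # SAD 0, PSNR inf, SSIM ~1, Corr ~1
```

The point of computing all four is that each captures a different failure mode: SAD and PSNR track raw pixel error, SSIM tracks structure, and the correlation coefficient tracks linear relationship regardless of brightness offset.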


The reporting of this is done at the click of the mouse. FIVE has been keeping track of your processing, and the results are compiled and produced on demand - no typing your report. (My arthritic fingers thank the programmers each day.)


Reports in the PDF/A standard mean the greatest compatibility when dealing with customers. Click on the hyperlink in the report and read the explanation of what was done, the settings, and the academic/scientific source for the test. This means that FIVE's reports are fully compliant with ASTM E2825-12. Are Photoshop's reports compliant? What about your "workflow engine's"? Hint: they are if you type them in such a way as to assure compliance. Who has time for that?

Total time for this experiment was under 5 minutes. I'm sure the presenter could have been faster than was shown in the webinar; he was explaining things, after all. But he used a basket of tools - some free and some not. He also didn't take the time to walk viewers through compiling an ASTM E2825-12 compliant report. Given the many tools used, I'm not sure how long that takes him.

When considering his proposed workflow, you need to consider the total cost of ownership of the whole basket, as well as the cost of training on those tools. You can also factor in how much time is spent/saved doing common tasks. I've noted before that prior to having FIVE, I could do about 6 cases per day. With FIVE, I could do about 30 per day. Given the amount of work in Los Angeles, this was huge.

For my test, I used one tool - Amped FIVE. I could do everything the presenter was doing in one tool, and more. I could move from the demonstrative to the quantitative - in the same tool.

Now, to be fair, I am retired from police service and work full time for Amped Software. OK. But the information presented here is reproducible. If you have FIVE and some Walmart video, you can do the same thing in this one amazing tool. Because I come from LE, I am always evaluating tools and workflows in terms of total cost of ownership. Money for training and tech is often hard to come by in government service, and one wants the best value for the money spent. By this metric, Amped's tools and training offer the best value.

If you want more information about Amped's tools or training, head over to the web site.


Saturday, September 2, 2017

Changing times

I've been in the "video forensics" business for quite some time now. I've seen enough to notice trends in the industry. I've seen people come and go. Today, I want to comment on a coming trend that I believe will impact everyone in the business, LEOs and privateers alike.

Here's what I mean.

Going back to about 2006, the economy was booming and folks were happy. Then 2007 hit and the economy tanked. As belts tightened, people cut back on entertainment and other non-essential things. A result of this was major cut-backs in the movie business. Many out-of-work editors and producers entered the business of video forensics. They guessed that because of their knowledge of the tools - Avid MC, Premiere Pro, Final Cut, etc. - they could go out there and compete for work, offering their services and "expertise" in video to the courts, attorneys, PIs, and the like. There were few success stories and a lot of colossal fails. Very few of these folks are still around.

Another trend is emerging.

In the push to assure future success, parents have been steering their kids to STEM degrees. Many have pursued and achieved doctorates in the STEM fields only to find that there is a glut of people on the market with such degrees (in my academic field, there's about a 600/1 ratio of applicants to jobs/grants). Some are leaving their degree field, using their expertise in experimental design and statistics (gained by every PhD) in a variety of useful ways (Think Moneyball).

A case* from Arizona last year serves as the canary in the video forensics coal mine. It's a firearms case, but all the issues can easily be applied to our field. In State v Romero (2016), the Arizona Supreme Court said that the trial court erred in not allowing the defense to call their "expert." The person in question wasn't a firearms examiner or a tool-mark examiner. He is an expert in Experimental Design, with a PhD in the discipline.

Here's some relevant parts of the ruling:

"...Dr. Haber was not offered to testify whether Powell had correctly analyzed the toolmarks on the shell casings. Instead, Dr. Haber, based on his expertise in the broader field of experimental design, criticized the scientific reliability of drawing conclusions by comparing tool marks."

"...Arizona Rule of Evidence 702 allows an expert witness to testify if, among other things, the witness is qualified and the expert’s “scientific, technical, or other specialized knowledge will help the trier of fact to understand the evidence . . . .” Trial courts serve as the “gatekeepers” of admissibility for expert testimony, with the aim of ensuring such testimony is reliable and helpful to the jury."

Hint, every state court and the US federal courts have a similar rule governing expert witnesses and their testimony.

"... The trial court here concluded that Dr. Haber was not qualified to testify as an expert in firearms identification. In affirming, the court of appeals noted that Dr. Haber, although having reviewed the literature on firearms identification, had not previously been retained as an expert on firearms identification, conducted a toolmark analysis, attempted to identify different firearms, or conducted research on firearms identification. 236 Ariz. at 458 ¶¶ 23-25, 341 P.3d at 500."

"... The issue, however, is not whether Dr. Haber was qualified as an expert in firearms identification, but instead whether he was qualified in the area of his proffered testimony — experimental design. Here, the trial court determined that Powell was qualified to offer an expert opinion that the shell casings were all fired from the same Glock. But Romero did not offer Dr. Haber as an expert in firearms identification to challenge whether Powell had correctly performed his analysis or formed his opinions. Instead, Dr. Haber’s testimony was proffered to help the jury understand how the methods used by firearms examiners in performing toolmark analysis differ from the scientific methods generally employed in designing experiments."

Did you catch that? Dr. Haber was retained to challenge the validity of the method used in the prosecution's examination - to illustrate "... how the methods used by firearms examiners in performing toolmark analysis differ from the scientific methods generally employed in designing experiments."

"... Under Rule 702, when one party offers an expert in a particular field (here, the State’s presentation of Powell as an expert in firearms identification) the opposing party is not restricted to challenging that expert by offering an expert from the same field or with the same qualifications. The trial court should not assess whether the opposing party’s expert is as qualified as — or more convincing than — the other expert. Instead, the court should consider whether the proffered expert is qualified and will offer reliable testimony that is helpful to the jury.  Cf. Bernstein, 237 Ariz. at 230 ¶ 18, 349 P.3d at 204 (noting that when the reliability of an expert’s opinion is a close question, the court should allow the jury to exercise its fact-finding function in assessing the weight and credibility of the evidence)."

"... The gist of Dr. Haber’s proffered testimony was that the methods generally used in conventional toolmark analysis fall short of scientific standards for experimental design. Dr. Haber’s testimony was therefore directed at the scientific weight that should be placed on the results of Powell’s tests. Such questions of weight are emphatically the province of the jury to determine. E.g., State v. Lehr, 201 Ariz. 509, 517 ¶¶ 24–29, 38 P.3d 1172, 1180 (2002). "

"... Apart from Dr. Haber’s qualifications, his testimony would not have been admissible unless it would have been helpful to the jury in understanding the evidence. Ariz. R. Evid. 702(a). The State presented Powell’s testimony that the indentations on shell casings demonstrated that the Glock had fired all the shells, including those at the murder scene, and the State argued that the toolmark comparisons demonstrated a match to “a reasonable degree of scientific certainty.” Dr. Haber’s testimony would have been helpful to the jury in understanding how the toolmark analysis differed from general scientific methods and in evaluating the accuracy of Powell’s conclusions regarding “scientific certainty.”"

"... The thrust of Dr. Haber’s testimony was that the methods underlying toolmark analysis (here comparing indentations and other marks on shell casings) are not based on the scientific method, but instead reflect subjective determinations by the examiner conducting the analysis. Haber would have explained that unlike experts who use other forms of forensic analysis rooted in the scientific method, firearms examiners do not follow an accepted sequential method for evaluating characteristics of fired shell casings and comparing them to control subjects. By describing the methods used by toolmark examiners, Dr. Haber’s testimony could have helped the jury assess how much weight to place on Powell’s “scientific” conclusion that the shell casings at the murder scene could only have been fired from the Glock found by the police when they stopped Romero." How big was the sample size in your experiment? How did you determine the appropriateness of that size? How did the casing's markings compare to a normal distribution of values derived from the sample / control subjects?

"... One of his critiques of the methodology used by firearms examiners is that they do not employ identifiable, standardized protocols." Show me the peer-reviewed, published source that describes the method used.

"... Dr. Haber’s testimony was intended to highlight that the conclusions drawn by firearms examiners from toolmarks do not result from the application of articulable standards and lack typical safeguards of the scientific method such as independent verification by other examiners. Thus, Dr. Haber’s testimony could have helped the jury to understand any deficiencies in the experimental design of toolmark analysis and to assess any suggestion that such analysis was “scientific.” Cf. Salazar-Mercado, 234 Ariz. at 594 ¶ 15, 325 P.3d at 1000 …" Who checked your work and signed off on it?

So why such a long post? I saw a video over on Deutsche Welle called "Crime fighting with video forensics." In it, the featured person made this statement: “each vehicle has a unique headlight spread pattern." Does it now? How does he know this? Did he conduct a study? Where is it published? Has he ever been asked to prove out his methodology? What was the sample size of the experiment? How was the appropriate size for the sample calculated? How would his "headlight spread pattern" methodology stand up to cross-examination by an attorney prepared by someone with knowledge of experimental design? Remember, there are a lot of out-of-work PhDs out there. What would happen if Dr. Haber was the opposing expert in your case?

The Reddit Bureau of Investigation tackles the subject here. A link from that page contains the following quote, "... all the things your describing sound almost.... Imperfect? I mean, it scares me to think I might get pinned for a crime because I have a similar headlight spread as someone else … So what I'm asking is, are techniques like headlight spread and clothing identification taken very seriously in court? ..." According to the DW story, the matching of the "headlight spread pattern" did lead to a conviction in the highlighted case. The posts are about 5 years old. Plenty of time for someone to actually test this method and publish results - not just post questions on Reddit. But, I can't find any studies in the academic repositories.

Now, I may seem to be picking on one person. I'm not. I'm picking on the use of techniques that are called "science" but have no foundation in any science or the scientific method. I found police-led training on the subject with a simple Google search. Well-meaning folks will be exposed to this topic and begin to use it in their investigations - perhaps unaware of the challenges to its validity that they may face if/when they testify as to their work.

Errors in conclusions and the use of untested methodologies threaten forensic science. It's not me saying this; it's the focus of the NAS Report. It's the reason the OSAC was created. If you're in the "video forensics" discipline, and you're giving your OPINION about something related to the evidence, PLEASE be sure that your opinion is grounded in valid and reliable science - science that you can quote when asked. For example, if you're using the Rule of Thirds to calculate the height of an unknown subject / object in a CCTV video, you will have problems under a capable cross-examination. Where in academics / science can you find a paper that tells you how to employ this method for this purpose? Hint: you can't. If you're using Single View Metrology in your measurements, you'll easily find the source document for this technique as well as the many papers that cite it.

And this is where the weakness in many "analysts" work can be found. When giving your opinion, what is the source of your conclusion? Which paper? Which study? How about simply listing your references / sources in your report so there's no confusion as to the basis of your opinions?

My entry into grad school opened my eyes as to what I didn't know and what the various trade groups where I'd received my training couldn't prepare me for. My pathway to my dissertation had me laser-focused on stats, experimental design, sample sizes, validity, and defending my work in front of people who have gone down a similar path and know way more than me. It's humbling to defend one's work - to be cross-examined by such brilliant people. But iron sharpens iron. I'm the better for it.

Rather than tell you it'll be OK, I'm saying: watch out. You're heading down an unsustainable path. If folks want to continue to use this method - "headlight spread pattern analysis" - probability says that there's going to be a challenge. Do you want that to be you? Are you prepared for it?

Something to think about ...

*I'm not an attorney. This is not legal advice. This is not about one person or one case, but the use of untested / un-scientific techniques. Check your six. Relax. Breathe. Love.