Wednesday, July 31, 2019

The four dimensions and image / video analysis

When someone mentions "different dimensions," we tend to think of things like parallel universes – alternate realities that exist parallel to our own, but where things work or happened differently. However, there is no FRINGE Division defending us against intruders from parallel worlds. The reality of dimensions and how they play a role in the ordering of our Universe is really quite different from this popular characterization. Understanding how they work is foundational to our work as digital / multimedia analysts.
To break it down, dimensions are simply the different facets of what we perceive to be reality. We are immediately aware of the three dimensions that surround us on a daily basis – those that define the length, width, and depth of all objects in our universe (the x, y, and z axes, respectively).

The first dimension, as already noted, is that which gives it length (aka. the x-axis). A good description of a one-dimensional object is a straight line, which exists only in terms of length and has no other discernible qualities.
Add to it a second dimension, the y-axis (or height), and you get an object that becomes a 2-dimensional shape (like a square).
The third dimension involves depth (the z-axis), and gives all objects a sense of area and a cross-section. The perfect example of this is a cube, which exists in three dimensions and has a length, width, depth, and hence volume.
Scientists believe that the fourth dimension is time, which governs the properties of all known matter at any given point. Along with the three other dimensions, knowing an object's position in time is essential to plotting its position in the universe.
Video and images fit well within this concept. But, it's important to understand what one is looking at when one analyzes multimedia files.

You see, multimedia takes the 4-dimensional world and records it in a 2-dimensional medium. Images and video are flat - X and Y only. There is an element of time as well. But, the third dimension is treated differently. The third dimension gets skewed a bit, causing a perspective effect.
Perspective, in this case, is an approximate representation of an image as it is seen by the eye and processed by the brain. The two most characteristic features of perspective are that objects appear smaller as their distance from the observer increases; and that they are subject to foreshortening, meaning that an object's dimensions along the line of sight appear shorter than its dimensions across the line of sight.

Because of this effect, it's important to master a few concepts as well as to have a valid toolset when working in this space.

Conceptually, Nominal Resolution is the numerical value of pixels per inch as opposed to the achievable resolution of the imaging device. In the case of digital cameras, this refers to the number of pixels of the camera sensor divided by the corresponding vertical and horizontal dimension of the area photographed. (SWGDE Digital & Multimedia Evidence Glossary, Version 3.0) In video and images, the farther away from the camera one gets, or the deeper into the scene one gets, the lower the nominal resolution becomes. At a certain point, nominal resolution moves from pixels per unit of measure to unit of measure per pixel (e.g. 2px / cm vs 2cm / px).
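As a rough illustration of how nominal resolution falls off with depth, here is a minimal sketch using an idealized pinhole camera model. The focal length, sensor width, and pixel count are hypothetical values chosen for the example, not taken from any particular device, and lens distortion is ignored.

```python
def nominal_resolution_px_per_cm(distance_m, focal_length_mm=4.0,
                                 sensor_width_mm=4.8, image_width_px=1920):
    """Pixels per centimetre on a plane facing the camera at distance_m (pinhole model)."""
    # Width of the scene covered by the sensor at this distance, in centimetres.
    scene_width_cm = (sensor_width_mm / focal_length_mm) * distance_m * 100.0
    return image_width_px / scene_width_cm


for distance in (1, 5, 10, 20, 40):
    resolution = nominal_resolution_px_per_cm(distance)
    if resolution >= 1.0:
        print(f"{distance:>3} m: {resolution:.2f} px/cm")
    else:
        # Past this point it is more natural to speak of centimetres per pixel.
        print(f"{distance:>3} m: {1.0 / resolution:.2f} cm/px")
```

With these assumed values the crossover from px / cm to cm / px happens somewhere between 10 m and 20 m from the camera. A real system would have to be tested to find where it actually occurs.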

The problem with validity in measurements in this discipline is that the majority of software, from freeware to Photoshop, treats every pixel the same. Basic planar geometry says that a pixel equals the same real world measure no matter where in the image you measure. But, with depth / perspective, we know this can't be the case. Thus, we need a valid toolset for Single Image Photogrammetry.

Single Image Photogrammetry uses elements within the image itself to estimate the measure of unknown objects / subjects (e.g. a doorway's known height informs the measure of a person who walks by / through). Single Image Photogrammetry is my preferred method of photogrammetry as it employs only the evidence item and does not require the creation of additional files.
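To make the doorway example concrete, here is a minimal sketch of scaling from a known reference. It is not a full single-view metrology method: it assumes the subject is standing in the same plane as the reference (in the doorway itself), it ignores lens distortion, and the function name, the pixel counts, and the one-pixel marking error are all hypothetical.

```python
def estimate_height_cm(ref_height_cm, ref_height_px, subject_height_px, pixel_error=1.0):
    """Return (low, nominal, high) height estimates in centimetres.

    Only meaningful when the subject and the reference are at the same depth.
    """
    nominal = subject_height_px * ref_height_cm / ref_height_px
    # Worst cases: the subject is under- or over-counted while the reference errs the other way.
    low = (subject_height_px - pixel_error) * ref_height_cm / (ref_height_px + pixel_error)
    high = (subject_height_px + pixel_error) * ref_height_cm / (ref_height_px - pixel_error)
    return low, nominal, high


# A 203 cm doorway spanning 290 px, and a person spanning 250 px in that doorway.
low, nominal, high = estimate_height_cm(203.0, 290.0, 250.0)
print(f"{low:.1f} cm to {high:.1f} cm (nominal {nominal:.1f} cm)")
```

Even in this simplified case the answer is a range rather than a single value, and the range widens as the nominal resolution drops.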

What do I mean by this - creation of additional files?

When utilizing reverse projection, for example, you must create a brand new recording. Assuming that you use the same recorder and camera/lens that was used in the creation of the evidence file (and that their settings remain unchanged from the time of the original recording), this new piece of evidence can be associated with the evidence file with a simple overlay. Similarity metrics can be employed to verify that the camera/lens position/settings haven't changed.

BUT... you must understand what reverse projection is from an evidentiary standpoint. You're creating a demonstrative exhibit when you engage in reverse projection (you must adequately explain what your exhibit demonstrates). You are creating a piece of evidence to demonstrate a single theory of the case. Thus, multiple reverse projection exhibits would be required in order to satisfy Daubert's requirement to account for multiple theories. It's also important to know that reverse projection alone is not measurement. It's an overlay. Because of compression and other errors, there will be a range of values possible for your eventual measure - not a single value. Reverse projection can assist in a follow-up measurement, such as Single View Metrology. Thus, the demonstrative (reverse projection) combined with a valid measure becomes reverse projection photogrammetry.

With the three dimensions properly accounted for, it's time to address the fourth dimension (pun intended). As noted in previous posts, time information extracted from multimedia containers is not "ground truth." It can't be assumed to be accurate. These devices are nothing more than a $7 box of parts. Thus, in order to attempt to link the timing information in the container to previous events, a valid experiment must be run.

All of this is to say, measuring objects / subjects within evidence footage is complex and requires a trained / experienced analyst employing a valid toolset. If you'd like to continue this discussion and move beyond simple vendor training (which buttons do what) and into the world of science and experimentation, we'd love to see you in one of our classes.

Friday, July 26, 2019

SWGDE Practical Considerations for Submission and Presentation of Multimedia Evidence in Court - initial draft for public comment

Today's post deals with SWGDE Practical Considerations for Submission and Presentation of Multimedia Evidence in Court, the initial draft for public comment. My comments on this guide deal more with foundation than specific technical details. Let's take a look.

In Section 4, you'll find this guidance, "This should also include documentation of any persons contacted in relation to the evidence." Wow!

Consider whom you may have contacted:

  • Your coworkers.
  • Your supervisor.
  • Others in your chain of command.
  • Others outside of your chain of command.
  • The FVA List.
If you sent a query to the FVA list, did you turn over a copy of your correspondence to the process in discovery? But more fundamentally, if you sent a query to the FVA list, did you have specific permission to do so? So many list members share copies of evidence on this relatively un-secure platform. If members of the FVA List shared opinions, did those opinions make it to discovery? Did you properly account for their opinions / their work if their opinions / work differs from yours?


In Section 5.1, there are a number of points that need to be considered.

First, "Has the proffered evidence been properly authenticated? F.R.E. 901, 902."

Wow again. Remember, "authentication" is different than "hash." Authentication deals with accuracy in context, not integrity in transport.

Also remember that the majority of programs used these days do not actually interact with the evidence files but create a proxy file via FFmpeg. This is certainly true of the major players in this space, Input-Ace and Amped FIVE.

I remember, when working for Amped Software, Inc., that many customers would contact us because FIVE's "conversion" of files (the creation of a proxy file) would result in a different frame count vs. the original data file. Naturally, the resulting file would not hash the same. But, testing the contextual authenticity is what FRE 901 & 902 are all about. Was there a material change in context that resulted from the creation of the "working copy?"

You wouldn't know if you didn't conduct even a basic authenticity exam.

Remember, contextual authentication is not a feature of LEVA's CFVT / CFVA programs. Neither is it a feature of the IAI's CFVE program. It is, however, one of the tested domains in the AVFA certification (see this post on these programs).

There are precious few people trained in contextual authentication or enabled by a valid and reliable tool set. If you're interested, please come to one of our upcoming training sessions or take our on-line authentication course.

Included in the discussion of contextual authenticity is the next statement in the document, "Is the proffered evidence an original or an accurate reproduction of the original? F.R.E. 1002 and F.R.E. 1003."

Again, how would you know if you don't test? How do you test if you don't know how to conduct authenticity exams? What tools do you use? What does your experimental design look like?

The FRE, in 901(a), deals with "true and exact" or "bitstream" copies. These copies will hash the same - copy vs. original. But, when you create a proxy / working file, and there are errors in the process, the hash values will be different.
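For the integrity side of that comparison, a minimal sketch using Python's standard hashlib module is shown here. The function names and the choice of SHA-256 are my own for illustration. A matching hash only shows that two files are bit-for-bit identical; it says nothing about contextual authenticity.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so that large video files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bitstream(original_path, copy_path):
    """True only when the two files are bit-for-bit identical."""
    return sha256_of(original_path) == sha256_of(copy_path)
```

A proxy / working file produced by transcoding would be expected to fail this check, which is exactly why the follow-up question about material changes in context has to be tested separately.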

"Authentication testimony may include:
The retrieval method.
The condition of the original recording device and the accuracy of the resultant multimedia.
Time offsets and other observations noted during the retrieval.
Agency evidence and storage protocols.
Chain of custody documentation."

The retrieval method should be documented.

  • Did you retrieve it? If not, who did? Are they on the witness list?
  • Are all of the device settings noted?
  • Is the complete signal path - lens to hard drive - accounted for?
"The condition of the original recording device and the accuracy of the resultant multimedia" speaks to ground truth. How do you know that the resultant multimedia file is an accurate representation of events? How would you know? You test. Did you test? Did you disclose your test design / reports / results?


By placing the FRE notes in the document, but giving them a "drive-by" treatment, they're setting up the typical examiner for problems. Why? The audience for this document is not the examiner, per se. It's trial support technicians and attorneys. They're pointing out an obvious problem in the system - that contextual authentication must happen - without noting that so few examiners have the knowledge, skills, and experience in this vital process.

Don't get me wrong, I'm not advocating for the elimination of authenticity exams. Quite the opposite, I'm one of the biggest proponents. What I'm saying here is that the document should do only one thing - render marginal technical advice on the "display" of evidence at trial. Leave the legal advice out completely. SWGDE is a loose collection of practitioners, not lawyers.

As always, if you have any issues with what you've read, please leave your (polite) comments below.

Have a good weekend, my friends.

Thursday, July 25, 2019

SWGDE Technical Notes on FFmpeg v2.0

As noted yesterday, I want to take a moment to examine a current document, SWGDE Technical Notes on FFmpeg v2.0.

I have a couple of big problems with this document. But first, what is FFmpeg?

FFmpeg is a free, open-source tool. It's also a tool that underpins many of the big platforms used in digital / multimedia analysis. But, what is important to know is that FFmpeg is not a tool of "forensic science." I've had concerns with the forensic science use of FFmpeg for some time. You can visit these posts to get caught up: 5/19/2014, 12/18/2018, & 3/23/2019 are a few key posts.

Point 1, there's a section missing in this document.

There is a need for a section describing what FFmpeg does, as a program, when data is missing or non-standard in reporting essential fields for playback. If yes, report. If null, then (x).

As an example, consider the frames per second field / data entry. There are many Chinese manufacturers that feature a non-standard or empty tag for this field that is interpreted within the DVR for playback via other programmatic information. In this case, FFmpeg (as a European product) may report the fps as 25 because there needs to be a value for fps in order to play the video. If the retrieval report notes that the recorder is expecting to place 30fps into the container, and the result of FFmpeg's work is 25 fps because of non-standard tag use, then the examiner should at least be aware (a) that this can/does happen, and (b) of how to address the results from a scientific standpoint (what it "is" vs. what it "should be").

This basic limitation in the data should be known by those that use this tool. In many cases, the practitioner takes what is output at face value, without further enquiry. Thus, a bit of a warning is necessary that this situation exists and that there are steps that can be taken to either ascertain the correct value or list it in a limitations section of the report as an unknown variable.
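One way to avoid taking the output at face value is to flag missing or unexpected frame-rate fields explicitly. The following is a minimal sketch, assuming ffprobe (shipped with FFmpeg) is on the PATH. The function names, the 1% tolerance, and the sample JSON are hypothetical; what ffprobe actually reports for a given proprietary container has to be established by testing.

```python
import json
import subprocess


def run_ffprobe(path):
    """Ask ffprobe for the first video stream's frame-rate fields as JSON."""
    cmd = [
        "ffprobe", "-v", "error", "-select_streams", "v:0",
        "-show_entries", "stream=r_frame_rate,avg_frame_rate",
        "-of", "json", path,
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_rate(text):
    """Turn a ratio such as '30000/1001' into a float; None if absent or unusable."""
    if not text or "/" not in text:
        return None
    num, den = text.split("/", 1)
    try:
        num, den = float(num), float(den)
    except ValueError:
        return None
    if num <= 0 or den <= 0:
        return None
    return num / den


def review_frame_rate(ffprobe_json, expected_fps=None, tolerance=0.01):
    """Report the frame-rate fields and note anything that needs explaining."""
    streams = json.loads(ffprobe_json).get("streams", [])
    stream = streams[0] if streams else {}
    reported = parse_rate(stream.get("r_frame_rate"))
    average = parse_rate(stream.get("avg_frame_rate"))
    notes = []
    if reported is None or average is None:
        notes.append("a frame-rate field is missing or unusable; list it as an unknown variable")
    if expected_fps is not None and reported is not None:
        if abs(reported - expected_fps) > tolerance * expected_fps:
            notes.append(f"reported {reported:g} fps differs from the expected {expected_fps:g} fps")
    return {"r_frame_rate": reported, "avg_frame_rate": average, "notes": notes}


# Hypothetical ffprobe output for a file whose recorder was expected to write 30 fps.
sample = '{"streams": [{"r_frame_rate": "25/1", "avg_frame_rate": "0/0"}]}'
print(review_frame_rate(sample, expected_fps=30))
```

For a real file, `review_frame_rate(run_ffprobe("evidence.mp4"), expected_fps=30)` would apply the same checks, where the file name is a placeholder. The notes are candidates for the limitations section of the report, not conclusions.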

Point 2, the output of the Macroblock Analysis function is incorrectly described.

Section 8.2 graphic (see below) incorrectly describes the Grey output block.
Grey doesn't "generally" mean "no change." Grey means that the block does not meet a defined condition. From the source, "Note that these are parsed in order (from top to bottom), and if one condition matches, the color will be chosen. If none matches, the macroblock will be rendered as grey." As they are parsed in order, the display order of the graphic should follow the guidance from FFmpeg.

Here's what FFmpeg's version of the flow chart looks like.
Not only is Grey incorrectly described, the description of the conditions represented by the various colours is incomplete.

The first line of SWGDE's chart shows purple as "new data." That colour in FFmpeg's chart is noted as 16x16 Intra Prediction. "New data" can be interpreted in several ways. But, "Intra Prediction" makes it quite clear that the block represents predicted data taken in some way from that frame.

A similar problem exists with the next line. Again, the "new data" description isn't accurate. It actually represents 4x4 Intra Prediction.

You'll also note that there are more conditions present in the latter chart (FFmpeg) than the former (SWGDE). Why should this be?

Thus it is that the information in the current document is incomplete and inaccurate. I have no idea why my suggestions weren't addressed during the revision. I received no communication from SWGDE - either acknowledging receipt of my correspondence or rejecting my suggestions. I'll address that issue in a later post.

As always, if you care to comment, please do so (politely) below.

Be well, my friends.

Wednesday, July 24, 2019

SWGDE Core Technical Concepts for Time-Based Analysis of Digital Video Files - draft for public comment

Continuing on from yesterday's post, let's examine SWGDE Core Technical Concepts for Time-Based Analysis of Digital Video Files. I'd like to share with you, the reader, what I shared with the SWGDE Secretary.

My primary concern with this document is the irrational switch from “what is” (e.g., the data in the container), to “what it should be” (e.g., the data’s relationship to previous events). As a document for examining the data in the container, this document is a good treatment of the relevant time code standards. As a document for attempting to link the data in the container to a previous event, it is quite lacking in scientific foundation. I will illustrate my points below.

Page 4 - Current Text:

This document does not address the process by which images are captured, sampled, and/or encoded; it focuses on the interpretation of data once it has been encoded into a binary format.

Issue:

The interpretation of the data, the procedure necessary to attempt to link the data in the container to a previous event, is not actually described in this document. The document assumes that what is in the container is “ground truth,” but that cannot be assumed and must be established through testing. Thus, the correct word of the focus is “reporting,” and not “interpretation.” Nothing in this document could be used to inform a conclusion that is founded in experimental science. Conclusions, opinion based testimony, are the results of analysis / experimentation, which is not the focus of this guide.

Page 4 - Current Text:

Determining the frame timing within a video file has several applications and may be particularly helpful in determining the accuracy of an unknown variable during an event of interest.

Issue:

The document describes the various time code types that may be present in a data container. Finding and reporting this data with valid tools is not a “determination” in the scientific sense, but a reporting of what is contained in the output of specific reporting processes.

Thus, the correct verb is “report,” and not “determine.” Determinations (aka conclusions) are opinion based and are thus the results of analysis / experimentation, which is not the focus of this guide.

Current Text:

Digital video containers and encoding formats define methods to encode timing information within binary streams or packages. Proper decoding of timing information is critical for the ability of software to provide accurate playback of digital video.

Issue:

A “proper decoding of timing information” assumes that a “proper” encoding of the information happened in the creation of the evidence file. The evidence file is a sample of 1, and until / unless a baseline or ground truth of the range of timing behavior of the recording device is established through a valid and reliable experiment, there is no way of knowing what is “proper.” If one seeks to link the timing of the video playback to previous events, one must build a performance model of the capture device. Given the nature of DVR manufacturing and the Just in Time manufacturing model that the majority of the DVR manufacturers employ, this procedure cannot be generalized from the results of tests of a single DVR. Rather, each DVR’s performance must be modeled.

Experimental Design

It’s a fundamental principle of experimental science that one can only measure “now.” To attempt to link “now” to “then,” in any direction of time, one must predict by designing, testing, validating, and implementing a prediction model. This is certainly possible to do with DVRs and frame timing via multiple linear regression. In this way, all of the variables can be controlled and a range of values computed.

If you were conducting research and comfortable with an error probability of .05 on both ends (Type 1 / Type 2), then the protocol for the sample size calculation would look like that shown below (controlling only for the number of camera views and not for the various data choke points, etc.):

t tests - Linear multiple regression: Fixed model, single regression coefficient

Analysis: A priori: Compute required sample size
Input: Tail(s)                       = Two
Effect size f²                 = 0.15
α err prob                     = 0.05
Power (1-β err prob)           = 0.95
Number of predictors           = 16
Output: Noncentrality parameter δ     = 3.6742346
Critical t                     = 1.9929971
Df                             = 73
Total sample size             = 90
Actual power                   = 0.9520952

If you were conducting this test in a criminal justice proceeding, the error probability should be lower: .01 on both ends (Type 1 / Type 2). The protocol for the sample size calculation would look like that shown below (controlling only for the number of camera views and not for the various data choke points, etc.):

t tests - Linear multiple regression: Fixed model, single regression coefficient

Analysis: A priori: Compute required sample size
Input: Tail(s)                       = Two
Effect size f²                 = 0.15
α err prob                     = 0.01
Power (1-β err prob)           = 0.99
Number of predictors           = 16
Output: Noncentrality parameter δ     = 4.9598387
Critical t                     = 2.6096879
Df                             = 147
Total sample size             = 164
Actual power                   = 0.9900354

Given the sample sizes involved, if SWGDE intended to guide the practitioner in building a performance model of the source DVR, the guidance should note the difference in Power between the two samples. Nevertheless, the appropriate sample size (the number of completed tests necessary to produce your range of frame rate values) is between 90 and 164 (as computed above).
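The figures in the two protocols can be cross-checked with a short simulation. This is a minimal sketch that uses only the Python standard library. It takes the critical t values from the protocols above as inputs rather than recomputing them, and the function names, number of draws, and seed are my own choices. The relationships used are δ = √(f² · N) and Df = N − predictors − 1, which reproduce the values in both protocols.

```python
import math
import random


def noncentrality(f2, n):
    """Noncentrality parameter: delta = sqrt(f^2 * N)."""
    return math.sqrt(f2 * n)


def residual_df(n, predictors):
    """Residual degrees of freedom: N - predictors - 1."""
    return n - predictors - 1


def simulated_power(n, predictors, f2, critical_t, draws=100_000, seed=1):
    """Monte Carlo estimate of two-tailed power from a noncentral t distribution."""
    rng = random.Random(seed)
    delta = noncentrality(f2, n)
    df = residual_df(n, predictors)
    hits = 0
    for _ in range(draws):
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
        t_stat = (z + delta) / math.sqrt(chi2 / df)
        if abs(t_stat) > critical_t:
            hits += 1
    return hits / draws


for n, critical_t in ((90, 1.9929971), (164, 2.6096879)):
    print(n, round(noncentrality(0.15, n), 7), residual_df(n, 16),
          round(simulated_power(n, 16, 0.15, critical_t), 3))
```

The simulated power should land close to the "Actual power" lines in the protocols, within Monte Carlo error. It is a sanity check on the arithmetic, not a substitute for the power analysis itself.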

Reporting

Once the requisite number of tests (samples) have been conducted, the results will be given as a range of values and not a single number. This range could then inform a speed calculation, which will also result in a range of possible values.

Limitations

Given the manufacturing process, there will be several failures in the test phase. These should be noted. Additionally, the total number of samples should only include completed tests that generated valid data. This will likely increase the number of attempts by roughly 5%.

Our tests in this area have shown that there is a small variability in timing when the file is meant to represent 1 minute of real time. However, when the file is meant to represent more than 30 minutes, as is often the case in DVR files, the minute to minute variability is actually quite large. Thus, the experiment should attempt to replicate the data generation conditions of the evidence source.
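Minute to minute variability of this kind can be summarized from a list of frame times. This is a minimal sketch, assuming the frame times (in seconds) were measured against an independent reference clock during a test recording. The function names and the synthetic data are hypothetical and do not reproduce our test results.

```python
from collections import Counter


def frames_per_minute(frame_times_s, duration_s):
    """Count frames in each complete minute of a recording of duration_s seconds."""
    counts = Counter(int(t // 60) for t in frame_times_s)
    complete_minutes = int(duration_s // 60)
    return [counts.get(minute, 0) for minute in range(complete_minutes)]


def fps_range(frame_times_s, duration_s):
    """Lowest and highest effective frame rate observed across complete minutes."""
    rates = [count / 60.0 for count in frames_per_minute(frame_times_s, duration_s)]
    return min(rates), max(rates)


# Synthetic two-minute recording: 30 fps in the first minute, 25 fps in the second.
times = [i / 30 for i in range(1800)] + [60 + i / 25 for i in range(1500)]
print(fps_range(times, 120))
```

A file-level average would hide the difference between the two minutes in this synthetic example, which is why the per-minute breakdown matters for longer files.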

Current Text:

“Commercial or open source tools are available to aid in the determination of speed, duration, and timing of events captured on video for both investigative and forensic examinations in civil and criminal litigation. For example, a frame information report can be generated with FFmpeg (See SWGDE Technical Notes on FFmpeg, Section 11.3).”

Issue:

As noted above, the frame information report generated with FFmpeg is just a “report” of what’s in the container, not a “determination” of any kind.

As a final point, the document references SWGDE Technical Notes on FFmpeg, which I will address in the next post.

As always, if you care to comment, please do so (politely) below.

Tuesday, July 23, 2019

SWGDE Best Practice for Frame Timing Analysis of Video Stored in ISO Base Media File Formats - draft for public comment

In case you missed it, SWGDE released several drafts for public comment. Ordinarily, I review the ones that pertain to the disciplines in which I'm engaged and offer comments where necessary. However, in the last year or so, the communication has been rather one-directional. I've not heard back that my comments were received or considered. Neither did my suggestions make it into the published version. Thus, given the lack of communication and transparency (more on that in a future post), I'm going to publish my comments here for each of the drafts, as well as submitting them to the SWGDE secretary.

We'll start with SWGDE Best Practice for Frame Timing Analysis of Video Stored in ISO Base Media File Formats.

My first issue with this document can be found in pages 4 - 8. Throughout these sections, the word / process “determine” is used. Given the guidance provided, this is the incorrect verb. To "determine" is to ascertain or establish exactly, typically as a result of research or calculation. No calculation method is provided in this document. No experimentation is recommended.

The document describes where, in various sections of an output report from FFmpeg, one can find information. This is not a “determination” in a scientific sense, but a reporting of what is contained in the output report of specific processes.

Thus, the correct verb is “report,” and not “determine.” This is supported by the Scope statement that the guidance is not intended to be used to inform a conclusion. Conclusions, opinion based testimony, are the results of analysis / experimentation, which is not the focus of this guide.

Given the Scope statement, and the fact that one is simply reporting the information in the container, and not establishing the “ground truth” of time, I've suggested a complete elimination of the statement found in the Limitations statement, "The concepts in this document may be used as part of investigations into determining object speed in recorded video." A reported time is not appropriate in a speed calculation. A determined time, established via experimentation (as explained below) may be used - but this document does not describe a valid experimental process for establishing the accuracy of the information in the data container.

My second issue deals with Sections 8 and 9.

Experimental Design

It’s a fundamental principle of experimental science that one can only measure “now.” For “then,” in any direction of time, one must predict by designing, testing, validating, and implementing a prediction model. This is certainly possible to do with DVRs and frame timing via multiple linear regression. In this way, all of the variables can be controlled and a range of values computed.

If you were conducting research and comfortable with an error probability of .05 on both ends (Type 1 / Type 2), then the protocol for the sample size calculation would look like this (controlling only for the number of camera views and not for the various data choke points, etc.):

t tests - Linear multiple regression: Fixed model, single regression coefficient

Analysis: A priori: Compute required sample size
Input: Tail(s)                       = Two
Effect size f²                 = 0.15
α err prob                     = 0.05
Power (1-β err prob)           = 0.95
Number of predictors           = 16
Output: Noncentrality parameter δ     = 3.6742346
Critical t                     = 1.9929971
Df                             = 73
Total sample size             = 90
Actual power                   = 0.9520952

If you were conducting this test in a criminal justice proceeding, the error probability should be lower: .01 on both ends (Type 1 / Type 2). The protocol for the sample size calculation would look like this (controlling only for the number of camera views and not for the various data choke points, etc.):

t tests - Linear multiple regression: Fixed model, single regression coefficient

Analysis: A priori: Compute required sample size
Input: Tail(s)                       = Two
Effect size f²                 = 0.15
α err prob                     = 0.01
Power (1-β err prob)           = 0.99
Number of predictors           = 16
Output: Noncentrality parameter δ     = 4.9598387
Critical t                     = 2.6096879
Df                             = 147
Total sample size             = 164
Actual power                   = 0.9900354

Given the sample sizes involved, the guidance should note the difference in Power between the two samples. Nevertheless, the appropriate sample size (the number of completed tests necessary to produce your range of frame rate values) is between 90 and 164 (as computed above).

With a sample size of 5, as illustrated in the document, the analyst has more chances of being wrong about frame timing than being right.

Reporting

Once the requisite number of tests have been conducted, the results will be given as a range of values and not a single number. This range will then inform a speed calculation, which will also result in a range of possible values.
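To show how a frame rate range carries through to a speed range, here is a minimal sketch. The distance, frame count, and frame rate bounds are hypothetical, and the distance is treated as exact, which it would not be in casework.

```python
def speed_range_kmh(distance_m, frame_intervals, fps_low, fps_high):
    """Return (slowest, fastest) speed in km/h for a distance covered over frame_intervals."""
    # Elapsed time is longest at the lowest frame rate, which gives the lowest speed.
    slowest_m_s = distance_m / (frame_intervals / fps_low)
    fastest_m_s = distance_m / (frame_intervals / fps_high)
    return slowest_m_s * 3.6, fastest_m_s * 3.6


# 20 m covered over 30 frame intervals, with a tested frame rate range of 24 to 30 fps.
slowest, fastest = speed_range_kmh(20.0, 30, 24.0, 30.0)
print(f"{slowest:.1f} km/h to {fastest:.1f} km/h")
```

In this example a spread of 6 fps in the frame rate becomes a spread of more than 14 km/h in the reported speed. Adding the uncertainty in the distance would widen the range further.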

Limitations

Given the manufacturing process, there will be several failures in the test phase. These should be noted. Additionally, the total number of samples should only include completed tests that generated valid data. This will likely increase the number of attempts by roughly 5%.

Our tests in this area have shown that there is a small variability in timing when the file is meant to represent 1 minute of real time. However, when the file is meant to represent more than 30 minutes, as is often the case in DVR files, the minute to minute variability is actually quite large. Thus, the experiment should attempt to replicate the data generation conditions of the evidence source.

Conclusion

I certainly hope that the SWGDE membership consider my comments. Again, I've submitted them via email to the SWGDE secretary, providing the requisite information.

I invite your comments below on what I've presented.

Friday, July 19, 2019

What is forensic science

As I prepare to head out to Orlando for next week's OSAC in-person meeting, I want to revisit one of the papers that the OSAC has issued since its founding.

Consider that the OSAC is a group that includes all forensic science disciplines. Thus, in harmonizing the language used to describe what should be a simple term - forensic science - much work was done to arrive at a definition that works for all forensic science disciplines.

I've shared the highlights in several posts and papers. Here's the full discussion.

---
2. Forensic Science
A definition of forensic science should focus on the evidence scrutinized and the questions answered by the inquiry. After extensive research, surveys, and discussions, the TG formed the following understanding of the aim and purpose of forensic science:

Traces are the fundamental objects of study in forensic science. A trace is a vestige, left from a past event or activity, criminal or not. The principle that every contact leaves a trace was initially attributed to Edmond Locard, and has evolved into a new definition of the trace to include a lacuna in available evidence, as well as activities in virtual settings (Jaquet-Chiffelle, 2013):
A trace is any modification, subsequently observable, resulting from an event.

This is not to suggest that all forensic questions involve event reconstruction, merely that all traces involve some modification. Even immutable objects can be a trace when their occurrence in relation to a forensic inquiry is the consequence of an event (e.g., a mobile device identifier deposited at a crime scene, or DNA transferred onto a victim). The modification can affect an entity in an environment or the environment itself. Its nature can be physical or virtual, material or immaterial, analog or digital. It can reveal itself as a presence or as an absence.

Forensic science addresses questions, potentially across all forensic disciplines. These questions are addressed using a specific and finite number of core forensic processes. For the purpose of this document, these processes are labeled as: 1) authentication, 2) identification, 3) classification, 4) reconstruction, and 5) evaluation.

The following definition of forensic science emerged from this work:
The systematic and coherent study of traces to address questions of authentication, identification, classification, reconstruction, and evaluation for a legal context.

The term systematic in this definition encompasses empirically supported research, controlled experiments, and repeatable procedures applied to traces. The term coherent entails logical reasoning and methodology. This definition uses legal context in the broadest terms, including the typical criminal, civil, and regulatory functions of the legal system, as well as its extensions such as human rights, employment, natural disasters, security matters.

---

Continuing on ...

---

3. Digital/Multimedia Evidence
To understand the scientific foundations of digital/multimedia evidence and how this fits into forensic science, it is necessary to consider the specializations of digital/multimedia evidence. Digital/multimedia evidence encompasses the following sub-disciplines (ed. note: edited for brevity), which are organized according to the current OSAC structure:

Video/image technology and analysis: handling images and videos for forensic purposes. This includes classification and identification of items, such as comparing an item in an image or video with a known item (e.g., car, jacket). This also includes authentication of images and videos, metadata analysis, Photo Response Non-Uniformity (PRNU) analysis, image quality assessment, and detection of manipulation. Operational techniques include image and video enhancement and restoration.

Digital evidence: handling digital traces for forensic purposes, including classification and identification of items, activity reconstruction, detection of manipulation (e.g., authentication of digital document, concealment of evidence). Within the current OSAC structure, audio recordings are treated as a form of digital evidence for enhancement and authentication purposes.

The foundational sciences for the various sub-disciplines of digital/multimedia evidence are primarily biology, physics, and mathematics, but also include: computer science, computer engineering, image science, video and television engineering, acoustics, linguistics, anthropology, statistics, and data science. Principles of these, and other disciplines, are applied to the traces, data, and systems examined by forensic scientists. Study of foundational principles in digital/multimedia evidence is ongoing, with consideration for their suitability in forensic science applications.

Furthermore, many digital traces are changes to the state of a computer system resulting from user actions. In this context, the discovery of principles in how computer systems function, is a fundamental scientific aspect of digital/multimedia evidence. The systematic and coherent study of digital/multimedia evidence is made more complicated by the evolving nature of technology and its use. While the foundations of digital/multimedia evidence are largely in computer science, computer engineering, image science, video and television engineering, and data science, the underlying digital traces are, in large part, created by actions of operating systems, programs, and hardware that are under constant development. As a result, it will not always be possible to test in advance the performance of such systems under every possible combination of variables that may arise in casework. However, it may be possible, to test the performance of a particular system under a particular set of variables in order to address questions arising in a specific case. For instance, digital documents created using a new version of word processing software can exhibit digital traces that were not previously known. The observed traces can be understood by conducting experiments; studying the software under controlled conditions. In this manner, generalized knowledge of digital/multimedia evidence is established and can be used by any forensic scientists to obtain reproducible, widely accepted results.

---

It's this last paragraph that I'll finish with. Notice these statements:

  • "However, it may be possible, to test the performance of a particular system under a particular set of variables in order to address questions arising in a specific case."
  • "The observed traces can be understood by conducting experiments; studying the software under controlled conditions. In this manner, generalized knowledge of digital/multimedia evidence is established and can be used by any forensic scientists to obtain reproducible, widely accepted results."
These statements have to do with validation and experimental design. Are you validating your tools? Are you conducting experiments, following the rules of experimental design?
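One small piece of validation is checking repeatability: the same input, processed the same way, should give the same output every time. Here is a minimal sketch in Python, assuming a hypothetical `process` function that stands in for whatever tool or filter is under test (it is not any real product's API):

```python
import hashlib


def process(data: bytes) -> bytes:
    """Hypothetical stand-in for the tool under test (here: invert every byte)."""
    return bytes(255 - b for b in data)


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_repeatable(func, data: bytes, runs: int = 5) -> bool:
    """Run func on the same input several times and compare the output digests."""
    digests = {sha256_hex(func(data)) for _ in range(runs)}
    return len(digests) == 1


known_input = bytes(range(256))  # a controlled, known test input
print(is_repeatable(process, known_input))
```

Repeatability alone is not validation. A fuller experiment would also compare the output against a known ground truth and vary one factor at a time.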

If you'd like to explore these concepts, we've got classes that address most of the topics illustrated in this section of the document. Check out our calendar. If you find a date / class that works for your schedule, sign up. If you can't find a date that works, suggest one. We're here to help.

See you in Orlando.

Wednesday, July 17, 2019

Natural Philosophy

In designing curricula for the digital/multimedia forensic sciences, I have spared no pains to secure clearness of expression, precision in definitions, and accuracy in the statement of facts. Wherever possible, references are quoted directly so that students are brought into the greater discussion of meanings, concepts, and percepts.

I've written previously about the value of a classical liberal arts education. One of the foundational courses of science is Natural Philosophy. Yes, Wikipedia describes Natural Philosophy as "the philosophical study of nature and the physical universe that was dominant before the development of modern science. It is considered to be the precursor of natural science." This is only partially true. For example, the classification systems we employ today find their roots in Aristotle's Categories. Aristotle's Categories still inform the discussion of colour and light today. An example of this is the Temperature/Tint adjustment of colour, which affects the quality of light. If you read a colour adjustment tutorial and have no idea what they're talking about when they talk about a light's quality, this is what they're talking about ... and it comes from Aristotle. For a massively deep dive on the subject of light, you can read this amazing article from 1952: Richard Kelly–Lighting as an Integral Part of Architecture.

This is the type of deep dive that you get in our classes, one that isn't necessarily featured by other instructors who only demonstrate what buttons to push on a particular piece of software. Yes, you were shown, "officially," what buttons perform what functions. But, you were shortchanged when you didn't learn the deeper meanings of processes, or their history.

Another example comes from our courses' treatment of audio. We have stand-alone Forensic Audio Analysis courses. We also weave the analysis of audio into our Forensic Multimedia Analysis and Redaction courses.

I've encountered a shallow treatment of audio topics in many courses, with the instruction offering banal trivia on the human range of hearing that conflicts with not only modern science, but Natural Philosophy. It also conflicts with my own experiences with Sensory Processing Disorder / Sensory Integration Disorder (SPD).

Here's a simplistic view of the range of typically wired human hearing. "Humans can generally hear sounds with frequencies between 20 Hz and 20,000 Hz (the audio range or hearing range) although this range varies significantly with age, occupational hearing damage, and gender. The majority of people can no longer hear 20,000 Hz by the time they are teenagers, and progressively lose the ability to hear higher frequencies as they get older. Most human speech communication takes place between 200 and 8,000 Hz and the human ear is most sensitive to frequencies around 1,000-3,500 Hz. Sound above the hearing range is known as ultrasound, and that below the hearing range as infrasound."

So bland. So meaningless. Yet, so often found in audio analysis courses and papers. It's missing the meat. It's missing the "so what" and the "why."

Here's a view of the same topic from Norton's 1870 contribution to the Eclectic Education Series, Elements of Natural Philosophy.

  • 402. Limits of hearing. All ears are deaf to some vibrations. The gravest sound perceptible to the human ear is produced by sixteen complete vibrations in a second; the highest sound is caused by thirty-eight thousand complete vibrations in a second. The auditory range is not the same for all persons. Some can not hear the highest notes of a piano, others are insensible to the note of a cricket, or even the chirrup of a house swallow. The hearing of these persons may be exceedingly acute within their limit; that is, they may be able to distinguish very feeble sounds, as the lowest whisper.
  • 403. The distance at which sound is audible varies with its original intensity and the circumstances which modify it. Still air, of great density and uniform temperature, is favorable to the transmission of sound. Under ordinary circumstances, a powerful voice is distinct at a distance of seven hundred feet. In the arctic regions, Lieutenant Foster conversed with a sailor at the distance of a mile and a quarter. The cry of a sentinel, "All's well," has been conveyed, in still air, over calm water, ten miles. Winds and currents increase or diminish the conducting power of air, according to their direction and force. The earth transmits sound further than air. The cannonading at Antwerp, in 1832, was heard in the mines of Saxony, three hundred and twenty miles distant. 

Norton's treatment of the subject features not only the quality of sound and the range of hearing, but a discussion of the medium necessary for sound propagation - from 1870. Interestingly, Norton's range spans from 16-38,000 "complete vibrations in a second." Yes, you might know Heinrich Rudolf Hertz's unit of frequency (Hz) - but those that constitute the Trier of Fact might not. Studying the classics gives the richness that helps answer the "so what" and the "why."
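Norton's "complete vibrations in a second" are what we now call hertz. As a rough worked example, the period and wavelength at each end of his range can be computed, assuming a speed of sound in air of about 343 m/s (a common textbook value for air near 20 °C, not a figure from Norton):

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 degrees C


def period_seconds(frequency_hz: float) -> float:
    """Time taken by one complete vibration."""
    return 1.0 / frequency_hz


def wavelength_metres(frequency_hz: float, speed: float = SPEED_OF_SOUND_M_S) -> float:
    """Distance the sound travels during one complete vibration."""
    return speed / frequency_hz


# The two limits quoted from Norton, in vibrations per second (Hz).
for label, freq in (("lowest", 16.0), ("highest", 38_000.0)):
    print(
        f"{label}: {freq:,.0f} Hz, "
        f"period {period_seconds(freq) * 1000:.4f} ms, "
        f"wavelength {wavelength_metres(freq):.4f} m"
    )
```

In plain terms, under that assumption the lowest tone Norton describes has a wave roughly 21 metres long, while the highest is a little under a centimetre.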

Which buttons to press, and in which order, really only makes lasting sense if you have a grasp of the underlying Natural Philosophy. If you'd like to learn these foundational subjects, we'd love to see you in one of our upcoming courses.

See you soon...

Wednesday, July 10, 2019

What is "official training?"

Years ago, while I was sitting in LEVA's old class, Forensic Video Analysis and the Law, Grant Fredericks stood at the front of the classroom and delivered a module on the use of Photoshop to a room full of civil servants. To the best of my knowledge, Grant is not, nor has he ever been, an Adobe Certified Expert (ACE) in Photoshop. Very few forensic video analysts are ACEs as we only utilize about 1/3 of what Photoshop can do, and the ACE test is comprehensive. Thus, the ACE isn't necessarily a good metric or barometer of a forensic video analyst's grasp of Photoshop.

Years later, while I was sitting in another LEVA-facilitated class, George Reis and Casey Caudle presented a deeper dive into Photoshop. George might be / have been an ACE, but Casey wasn't.

I have presented countless Photoshop training classes and have written a book on Photoshop for forensic analysts.

None of this was ever "official" Adobe activity. Even when I presented two days of Photoshop training in the main training room at Adobe in San Jose, it wasn't "official" Adobe training. It was Adobe being nice and gracious in their sponsorship of NaTIA, under whose banner I was teaching.

Has a lack of "official" Adobe training ever been a problem for you in your testimonial experience? I've been asked about my training. I've listed it on my CV. Syllabus files are available for the courses I teach. But, case law can't be found where an analyst was excused due to their having been trained in Photoshop, but not during an "official" Adobe training event.

Professional photographers and digital artists know the name, Scott Kelby. Scott has an amazing record of training photographers and artists in the intricacies of Photoshop and the various Adobe tools. He's assembled the best and brightest to his brand, offering amazing value. Prior to Beckley in California, I was a member of his National Association of Photoshop Professionals. NAPP is not an "official" Adobe endeavour. It's a group of like-minded individuals, supported by Adobe. But it's Kelby's thing, not Adobe's.

Elsewhere in our tool set, countless analysts have sat through Larry Compton's FOSS tool lectures and courses. With FOSS tools, there's really no possibility of an "official" training. If you use the popular ffmpeg, did you travel to France to get training from the originators of the tool? Likely not. You may have read a bit on the net, browsed the wiki, reached out to someone like Larry for advice, then hit the command line with gusto. Are your results invalid because Larry isn't authorized to deliver training on his basket of tools? Validity isn't a function of the "officialness" of the instruction you've received.

In the period between 2012 and early 2019, I was in the employ of Amped Software, Inc. I delivered training on Amped's product line in the US, Canada, and South Africa. I went on site to teach a single agency's staff. I hosted courses in our offices with a mix of people from over 40 countries. The curriculum that I delivered was (and still is) informed by the context in which the courses were presented.

I'm in a unique position, being a trained (CA POST) and educated (WGU MEdID) curriculum designer. The focus of my training sessions has always been "tool-assisted training to competency," as defined by the ASTM and informed by my years of experience in law enforcement. I've focused on the discipline, which is enabled by a toolset. In this way, the only real change from Photoshop to any other tool was the platform. In my Introduction to Forensic Multimedia Analysis courses, the tool is incidental to the delivery of knowledge around the standards, practices, laws, and procedures. Sure, certain platforms make the job easier. In teaching a toolset, like Photoshop or FIVE, I always frame the curriculum around standards compliance and best practices - and present the tool-specific portions after the foundation has been set.

I've written about the new freedom that I'm enjoying now that Amped have retreated to their mail box in Brooklyn. The plethora of classes now on my calendar are not "new." The curriculum development was done years ago. I've wanted to offer these for quite some time to the general market. I have presented them to select US federal service staff on several occasions - causing problems in the process. You see, I was not "allowed" to do so by Amped. Their thinking was simple: I was the only one offering these classes. It would cause a business problem for them if their customers in Russia, for example, wanted to take the courses given that I, as a US citizen, would not be able to deliver them. Amped, having been founded in the epicenter of Italian Marxism, wanted a standard offering at a standard price. Amped, being an Italian company, is free to offer its products and services to agencies in countries prohibited to US-based businesses, like Russia.

If you want to pay a premium to receive your information from a person with an "ampedsoftware.com" email address, please do so with my blessing. If you want real-world guidance, training to competency, and an emphasis on science and standards, designed and delivered by someone who has been there and is still doing that, I'd love to see you in one of my classes. You can find our training calendar here.

Until then, have a great day my friends.

Tuesday, July 9, 2019

FRE Rule 26 asks, where are your notes?

Remember your math teacher demanding that you "show your work"? Forensic science is no different. We want to see your process, your thoughts, your notes ...

"The law, however, expects expert witnesses to write reports for the court, outlining their opinions on the matters about which they are testifying. In the eyes of the law, the expert's testimony is allowed by the court only to the extent that the expert adequately documents the basis for it in a report and makes it available to the opposing party before the trial." - from A Guide to Forensic Testimony: The Art and Practice of Presenting Testimony As An Expert Technical Witness (link).

The key to understanding how the legal process requires you to document your opinions lies in Rule 26 of the Federal Rules of Civil Procedure.

(2) Disclosure of Expert Testimony.
In addition to the disclosures required by paragraph (1), a party shall disclose to other parties the identity of any person who may be used at trial to present evidence under Rules 702, 703, or 705 of the Federal Rules of Evidence.

Except as otherwise stipulated or directed by the court, this disclosure shall, with respect to a witness who is retained or specially employed to provide expert testimony in the case or whose duties as an employee of the party regularly involve giving expert testimony, be accompanied by a written report prepared and signed by the witness. The report shall contain a complete statement of all opinions to be expressed and the basis and reasons therefor; the data or other information considered by the witness in forming the opinions; any exhibits to be used as a summary of or support for the opinions; the qualifications of the witness, including a list of all publications authored by the witness within the preceding ten years; the compensation to be paid for the study and testimony; and a listing of any other cases in which the witness has testified as an expert at trial or by deposition within the preceding four years.

These disclosures shall be made at the times and in the sequence directed by the court. In the absence of other directions from the court or stipulation by the parties, the disclosures shall be made at least 90 days before the trial date or the date the case is to be ready for trial or, if the evidence is intended solely to contradict or rebut evidence on the same subject matter identified by another party under paragraph (2)(B), within 30 days after the disclosure made by the other party. The parties shall supplement these disclosures when required under subdivision (e)(1).

You see the emphasis made on turning over everything in discovery. Notes, reports, and your complete CV. In the world of e-discovery ... this also means notes and reports generated by the programs that we use - if any are created. This means the History Log, in Photoshop or Amped FIVE, is more than likely discoverable. It's a record of the steps the software has taken (at your direction) on an image or series of images, video, or ...

Bottom line, take good notes. If you'd like to know more about taking good notes, standards compliance, and other foundational aspects of forensic video analysis, check out our on-line course Introduction to Forensic Multimedia Analysis.

Monday, July 8, 2019

New Course Announcement

Today marks the beginning of a new era in our training business. I'm happy to announce that Introduction to Forensic Multimedia Analysis is now available on-line as micro learning.

What is Introduction to Forensic Multimedia Analysis? What does it cover?

This class serves as the entry point for those analysts involved in forensic image and video analysis, aka Forensic Multimedia Analysis. It features a detailed overview of the technological, legal, and personnel aspects of working with multimedia evidence in a forensic science setting. Graduates will acquire an introductory level education in the knowledge necessary to evaluate and analyze multimedia evidence in a forensic science setting. With an emphasis on standards compliance, students will also learn the best practices of reporting, packaging, delivering, and presenting their findings and evidence in the criminal / civil court context.

The curriculum is presented with strong emphasis on the work flow required to process this unique type of evidence. Beginning with the end in mind, each example is presented and students learn from the standpoint of an eventual testimonial experience.

In focusing on fundamental knowledge, the information in this course is not specific to any analysis platform or software solution.

Yes. That's right. Regardless of the tools you use, this class can help contextualize the work. It fills in the gaps that often happen with tool-specific training courses.

You've learned which buttons to press. You've learned how the tools work. Now learn the standards and the science behind the discipline. This knowledge - being able to answer "why" type questions - is essential on your career path as an analyst.

As an example, from the course's content, do you know the fundamental difference between the HSL and HSB colour spaces? HSL was designed to closely align with the way human vision perceives colour-making attributes. HSB was designed as a way to try to describe colour more intuitively than with RGB numbers. This level of detail will help you understand the tools that you use at a deeper level, as well as help flavour your testimony - providing plain meaning to industry jargon.
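To see that the two models really do give different numbers for the same pixel, here is a small sketch using Python's standard `colorsys` module. Note that `colorsys` names the models HLS and HSV; HSV is the same model as HSB. The helper `hsl_and_hsb` is a name made up for this illustration, and channel values are on a 0.0 to 1.0 scale:

```python
import colorsys


def hsl_and_hsb(r: float, g: float, b: float) -> dict:
    """Convert one RGB colour (channels 0.0-1.0) to both HSL and HSB."""
    hue_l, lightness, sat_l = colorsys.rgb_to_hls(r, g, b)  # returns (H, L, S)
    hue_b, sat_b, brightness = colorsys.rgb_to_hsv(r, g, b)  # returns (H, S, V)
    return {
        "HSL": (hue_l, sat_l, lightness),
        "HSB": (hue_b, sat_b, brightness),
    }


# Pure red and pure white, converted both ways.
for name, rgb in (("red", (1.0, 0.0, 0.0)), ("white", (1.0, 1.0, 1.0))):
    print(name, hsl_and_hsb(*rgb))
```

Pure red and pure white both come out with a Brightness of 1.0 in HSB, but in HSL red has a Lightness of 0.5 while white has a Lightness of 1.0. The third channel is not interchangeable between the two models, which matters when a tool's slider is labeled only "L" or "B."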

In the next few months, we'll be rolling out other courses in our online portal ... starting with Introduction to Image Authentication. Then, we'll bring out our offerings around the Amped and Adobe tools, followed by deep dives into some of the more popular freeware tools, like ffmpeg.

Here's why:

Many who have been given the task to work on this incredibly valuable evidence type live in states that restrict travel. Others work in cities, counties, and states with budget problems. Traveling to training thus becomes a problem. For every dollar spent on training, two or more dollars must be spent on logistics. That formula isn't sustainable.

Oftentimes, analysts in small towns are disadvantaged as they lack the funds to travel and they lack the population density to attract a training company to their location. Additionally, many municipalities have strict rules prohibiting private companies from profiting off of government resources (no more trading seats for space). Some training companies have thresholds of attendance that must be met in order to hold the training. Signing up, only to have the class cancelled a few weeks out, can cause chaos with your training office.

Another issue that our learning portal solves is the disparity between public and private labs in terms of training opportunities. The majority of training classes are held at government / law enforcement facilities and restricted to those types of employees. Thus, how are those in private practice supposed to train? With us, they can. Everyone can. Remember, there really are no "sides" in science. We all serve the Trier of Fact in interpreting, analyzing, and explaining the issues around this vital type of evidence.

We've worked hard to check all of the boxes in delivering these offerings to you:

  • Affordable
  • Convenient
  • Relevant
  • Feature Packed
Our introductory education courses are priced below $500. This means that approvals will be easier. This means that you can likely justify the expense and use your purchase card to sign up for classes. Also, given the nature of on-line micro learning, this course is priced to be affordable wherever in the world you happen to be.

Yes, we've designed the course with the world in mind. So many courses are designed with a specific country's context informing the development. We've blended relevant standards and procedural information from Australia, Canada, India, Europe, New Zealand, South Africa, the United Kingdom, and the US ... as well as international standards from the ASTM and the worldwide working groups.

Having a worldwide context, and a platform available on all major device types (even iOS and Android), makes this course ... and all of our courses ... incredibly convenient.

As for relevance, we're filling in the gaps and providing context to what you may already know. We're diving deeper than you've likely gone before. We're setting the scientific foundation for the work. Given that forensic science is the systematic and coherent study of traces to address questions of authentication, identification, classification, reconstruction, and evaluation for a legal context, this course presents the system and leads the way to a coherent study of traces to address the questions that come up in your inquiries. It will provide valuable information and knowledge that will help you make sense of the tools and processes used in this complex discipline.

The course is presented as a 16 hour experience - meaning this course would normally fill two full days of instruction. But, with the use of interactive text, you can dive as deep into topics as you desire. If you dive deep on all topics in the course, this course expands to beyond 40 traditional classroom hours of instruction. As a self-directed learner, the choice is yours.

The beauty of micro learning is that you learn at your own pace and on your own schedule. All too often, the "death by PowerPoint" section of class puts students to sleep. In sleeping through the lessons, they miss vital information. Micro learning lets you explore topics in depth, or in small bites, as your schedule and energy level permit. Once enrolled, you have 60 days to complete the course.

I invite you to examine the syllabus for this course. Then, please click over to the web site and sign up. I look forward to seeing you in class.


Tuesday, July 2, 2019

Yep! They Did It Again.

I've been teaching classes on multimedia analysis for a while now. Almost a decade ago, I created a curriculum and began teaching classes for Amped Software, Inc. in the US and elsewhere. One of the things I made sure to focus on was the specific use case for my students. If I was in Georgia, I would focus on Georgia law. Likewise for the Province of Ontario or South Africa.

In taking a walk through the program, I would point out the Program Options within FIVE (View>Program Options), and we'd spend the better part of the morning going over each setting and its implications for the analyst's work.

What took the most time to explain was the Log Files option. By default, Amped SRL's FIVE creates a log file that records every step from the moment you launch the program to the time you shut it down. It has since Build 4376, as I noted in this old post from 2012.

Amped SRL's Marco Fontani explains, in this recent blog post, what a wonderful thing the log files are - in his opinion. It's important to remember that Marco is in Italy, where the rules and laws are quite different from elsewhere in the world.

What is a log file? What's in it? Marco explains:

We open the file and find an impressive amount of information. The file begins with the product info: release number and build date, info about the computer and operating system, info about installed applications that may influence report creation (Microsoft Word in this case). Then, we have a full history log of every move we made while using Amped FIVE: which filter we added, how we configured (and possibly re-configured) it, which filter we deleted, and so on.

Wait. What?

You read that right. In case you missed my 2012 blog post, everything about your operating environment. Every single step. The time each step was performed. Everything. Recorded.

Did you know that FIVE was doing that?

If you have been to one of my classes, you do. If you're a long time reader of this blog, you do. I'm the one that told you to turn off this feature and only activate it under one of two very specific situations.

  1. If you were having an issue with FIVE and you were specifically told by an Amped Software, Inc. employee to activate it in order to step through the problem.
  2. Outside of troubleshooting - if you have specific permission from your chain of command to generate / create this type of potentially exculpatory information for your case work.
What if you received a discovery request for any / all files related to a case in which you took part, and you didn't disclose this information (that you didn't know existed)? Once created, the log files have to be saved with the case. Please don't delete potentially exculpatory or other case related info unless specifically directed to by your chain of command (or by statute). This isn't legal advice as such, it's just my experience as an analyst in government service who also served as a Union Rep & Shop Steward.

The other little Jack-in-the-Box to Marco's post, the one he hints at but doesn't say explicitly, is that the log is a record of everything from Open of FIVE to Close. Why would that be important?

If you've been in my classes, you'll remember me saying (rather regularly) to shut down FIVE after processing your case and restart it to begin a new case. Yes. I told you about the log files. But, just in case you forgot later, and FIVE resets itself or you move to a different computer, I wanted to get you in the habit of refreshing FIVE between cases. Why?

Because if the log is a record of everything done, every step accepted or rejected, from Open of FIVE to Close, AND you don't shut FIVE down between cases, you've just mixed cases on the same log file.

Yes, you did that and you didn't know it.  

Now, each case gets the record of what you did on the other case(s). Did you just work 30 cases without closing FIVE? Surprise, all 30 cases are on the same log file. All 30 cases' attorneys get to know what you did or didn't do on the other cases. Can you imagine the cross examination nightmare?
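The mechanics are easy to see in a toy model. The sketch below is not FIVE's log format or code; `SessionLog` and the case labels are hypothetical, and it only illustrates a log whose lifetime is tied to the program session rather than to a case:

```python
class SessionLog:
    """Toy log that lives from program 'open' to program 'close'."""

    def __init__(self):
        self.entries = []

    def record(self, case_id: str, step: str) -> None:
        """Append one step; the log itself has no idea where a case begins or ends."""
        self.entries.append((case_id, step))

    def cases_in_log(self) -> list:
        """List every case that left entries in this one log."""
        return sorted({case_id for case_id, _ in self.entries})


# One session, two cases worked back to back without restarting.
shared = SessionLog()
shared.record("case-A", "load video")
shared.record("case-A", "add filter")
shared.record("case-B", "load image")
print(shared.cases_in_log())  # both cases appear in the one log

# Restarting between cases: a fresh log per session, one case in each.
first, second = SessionLog(), SessionLog()
first.record("case-A", "load video")
second.record("case-B", "load image")
print(first.cases_in_log(), second.cases_in_log())
```

In this model, the only thing that separates one case's record from another's is the restart.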

In case you're wondering what this might look like, here's a peek from 2009 - FRE Rule 26 and the History Log.

Before you say it, I'm not suggesting you hide anything. What I am saying is that you should, you must, seek your chain of command's counsel before activating that feature. When I did, I was told very frankly to shut it off. You see, at the LAPD, someone of my rank/pay grade did not have permission to create a new "form," which is essentially what the log file is.