Forensic Multimedia Analysis Blog - Archived

This blog is no longer active and is maintained for archival purposes. It served as a resource and platform for sharing insights into forensic multimedia and digital forensics. Whilst the content remains accessible for historical reference, please note that methods, tools, and perspectives may have evolved since publication. For my current thoughts, writings, and projects, visit AutSide.Substack.com. Thank you for visiting and exploring this archive.

Friday, August 30, 2019

Yes, you do need stats ... actually

Yesterday, I received the good news that my validation study of how a course in statistics could improve the statistical literacy of digital / multimedia forensic analysts when delivered on-line as micro-learning was published by the Chartered Society of Forensic Science in the UK. I got excited and put the good news on my LinkedIn feed.

Along with the usual emoji responses, I received the comment shown below.

Rather than simply comment there, I'd like to take the opportunity to illustrate the many ways in which it's not just me who says that the world of the digital evidence analyst can benefit from a solid foundation in statistics.

You see, the course was created because the relevant government bodies around the world have said, on a rather regular basis, that the investigative services and the forensic sciences need a solid foundation in statistics.

Starting at the US government level, there's the PCAST Report from 2016 (link): "NIST has also taken steps to address this issue by creating a new Forensic Science Center of Excellence, called the Center for Statistics and Applications in Forensic Evidence (CSAFE), that will focus its research efforts on improving the statistical foundation for latent prints, ballistics, tiremarks, handwriting, bloodstain patterns, toolmarks, pattern evidence analyses, and for computer and information systems, mobile devices, network traffic, social media, and GPS digital evidence analyses." (emphasis is mine)

CSAFE has already responded with some tools for digital forensic analysts (link). The ASSOCR tool will help analysts "determine if two temporal event streams are from the same source by through this R package that implements a score-based likelihood ratio and coincidental match probability methods."

The HEISENBRGR toolset can be used to "match accounts on anonymous marketplaces, to figure out which of them belong to the same sellers."

What about NIST? What is the issue that NIST is taking steps to address? The PCAST report notes, "The 2009 NRC report called for studies to test whether various forensic methods are foundationally valid, including performing empirical tests of the accuracy of the results. It also called for the creation of a new, independent Federal agency to provide needed oversight of the forensic science system; standardization of terminology used in reporting and testifying about the results of forensic sciences; the removal of public forensic laboratories from the administrative control of law enforcement agencies; implementation of mandatory certification requirements for practitioners and mandatory accreditation programs for laboratories; research on human observer bias and sources of human error in forensic examinations; the development of tools for advancing measurement, validation, reliability, and proficiency testing in forensic science; and the strengthening and development of graduate and continuous education and training programs."

It's that last bit that prompted me to design and validate an instructional program in statistics for forensic analysts. But it's the first sentence that speaks to the comment from LinkedIn. Analysts don't deal in absolutes or definites - the binary. The world of the computer program may be binary, but the world certainly isn't. There is a natural variability to be found everywhere. But more to the comment's point, how does an analyst know that their "various forensic methods are foundationally valid, including performing empirical tests of the accuracy of the results"?

Ahh... but, you're saying, all of your support is from the United States. It doesn't apply to the rest of the world. In that, you're wrong. Let's look at the UK.

In September 2018, Members of the Royal Statistical Society Statistics & Law section (link) submitted evidence (link) to a House of Lords Science and Technology Committee inquiry on Forensic Science. Question 2 asked, "what are the current strengths and weaknesses of forensic science in support of justice?" Here's the RSS' response. Notice the imbalance between strengths and weaknesses.

I've highlighted the relevant section as it relates to this topic. "poor quality of probabilistic reasoning and statistical evidence, for example, providing irrelevant information because the correct question is not asked. For example, an expert focused on the rarity of an event, rather than considering two competing explanations of an event."

Our course on statistics for forensic analysts seeks to teach probabilistic reasoning, exploring the differences between objective and subjective statistics, as well as the fact that most of the forensic sciences currently work in the world of abductive reasoning (taking your best shot).
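To make the RSS's point about "two competing explanations" a bit more concrete, here is a minimal Python sketch of a likelihood ratio. The probabilities are invented purely for illustration, and the function name is my own; it is not taken from the CSAFE tools or from any report cited here.

```python
def likelihood_ratio(p_evidence_given_h1: float, p_evidence_given_h2: float) -> float:
    """Ratio of the probability of the evidence under two competing explanations."""
    if p_evidence_given_h2 <= 0:
        raise ValueError("probability under the second explanation must be positive")
    return p_evidence_given_h1 / p_evidence_given_h2

# Hypothetical numbers: the observation is rare under BOTH explanations.
p_under_h1 = 0.01    # probability of the observation if explanation 1 is true
p_under_h2 = 0.001   # probability of the observation if explanation 2 is true

lr = likelihood_ratio(p_under_h1, p_under_h2)

# Rarity alone (0.01) says little; the ratio shows which explanation
# the observation supports, and by how much.
print(f"likelihood ratio: {lr:.1f}")
```

An expert who reports only "this event is rare" has answered a different question from the one the ratio answers, which is the weakness the RSS describes.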

Now there's the accusation that digital analysts are often engaged in "push button forensics." We buy tools from vendors and hope that they're fit for purpose and accurate in their results. But are they? We don't know, so we validate our tools (hopefully). If you're trusting the market to deliver reliable, valid, and accurate tools, you may be disappointed. As the above referenced report notes, "What can be learned from the use of forensic science overseas? Seen from continental Europe, there has been a loss of an established institution (FSS) with a profound body of knowledge. Now research seems scattered among different actors (mainly academic), as commercial providers might have other priorities and limited resources to invest in fundamental research." (emphasis mine)

To the Royal Statistical Society's point, if you're a digital analyst and there's a challenge to your conclusions or opinions, on what do you base your response or your work? For example, you've retrieved photos from a computer or phone. Your tool automatically hashes the files. But a cryptographic hash does not guarantee the authenticity of the file; it only places a unique value into the process to handle questions of integrity. How do you conduct an authenticity examination without a knowledge of statistics? You can't. How do you validate your tools without a knowledge of statistics? You can't.
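A minimal Python sketch of the integrity-versus-authenticity distinction follows. The byte strings are made-up stand-ins for real evidence files, and the variable names are my own. It only uses the standard library's hashlib.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical stand-ins for file contents; a real workflow would read bytes from disk.
acquired = b"bytes of the image as acquired"
working_copy = bytes(acquired)        # bit-for-bit copy of the acquired file
altered_copy = acquired + b"\x00"     # the same file with one extra byte

# Integrity: matching hashes show the copy is unchanged since acquisition,
# and a mismatch shows that something changed.
integrity_ok = sha256_of(acquired) == sha256_of(working_copy)
change_detected = sha256_of(acquired) != sha256_of(altered_copy)

# Authenticity is a different question: a file edited BEFORE acquisition
# hashes just as consistently as an untouched one.
edited_before_seizure = b"bytes of an image edited before it was seized"
hash_still_consistent = (
    sha256_of(edited_before_seizure) == sha256_of(bytes(edited_before_seizure))
)

print(integrity_ok, change_detected, hash_still_consistent)
```

The hash tells you the file hasn't changed since you hashed it. It says nothing about what happened to the file before that moment.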

Over in Australia (link), there is agreement on the need for training and research - just what I've presented. "There is however one aspect of the report with which the Society is in complete agreement; the need for both continuous training and research in forensic science. We are also aware of the lack of funding for this research and therefore support the recommendation of PCAST that this is essential if our science is to continue to develop into the future."

To conclude, yes, you do need training / education in statistics if you're engaged in any forensic science discipline. Many practitioners arrive in their fields with advanced college degrees and thus will have had exposure to stats in college. But, on the digital / multimedia side, many arrive in their fields from the ranks of the visible policing services. They may not have a college degree. They may only have tool-specific training and may be completely unaware of the many issues surrounding their discipline. It's for this group that I've designed, created, and now validated my stats class. It's made in the US, to be sure, but it's informed by the global resources listed in this post - and many others.

I hope to see you in class.

Monday, August 26, 2019

Demonstrative exhibits and reconstruction of events

My last post generated a few responses that I want to address in a separate post, as opposed to editing the previous post. A few people got the impression that I was saying that what folks are doing with ACE's Camera Match Overlay isn't "forensic science" or "forensic video analysis." That's not at all what I was saying. Let's dive into that question to explain.

First, the definition of forensic science again: "Forensic science is the systematic and coherent study of traces to address questions of authentication, identification, classification, reconstruction, and evaluation for a legal context."

Forensic science thus includes:

authentication
identification
classification
reconstruction
evaluation

The type of work performed in the examples on ACE's website clearly indicates that the Camera Match Overlay is a tool for reconstruction. This is how the product is being positioned in the market. Camera Match Overlay is an addition to ACE, and not part of its basic functionality. If you're not involved in reconstruction, you can skip the Overlay tool and save a few bucks.

What ACE's basic functionality excels at is "evaluation." What's in the container? How should it be handled? Those File Triage type questions. Once answered, it's a short trip to repackaging the data in a format that is playable for the end user. But remember, evaluation has its own set of rules.

ACE is also really good at reconstruction - the syncing and linking of separate video streams. Reconstruction has its own rules, workflow, and toolset. Reconstruction attempts to illustrate a theory of the sequence of events in question. Reconstruction is not authentication, identification, or classification - which have their own rules, workflows, and toolsets.

With that in mind, the second set of questions deals with training and tools.

A 16 hour training session on Camera Match Overlay's operation and use is likely sufficient for a technician to be able to know which buttons do what functions across a variety of use cases. What it is not is a comprehensive education on photogrammetry. Because the focus of tool-specific training is the tool, we've split off the foundational education side as separate, non-tool-specific deep dives so that you get an unbiased exploration of the discipline from a neutral third party. If you're giving technician level testimony (no opinion offered), tool-specific training is likely enough. But, if you're offering an opinion (even passively), then you need a foundational education in the discipline in which you're engaged.

The third set of questions deals with the legal aspects of evidence hearings.

Keeping in mind that I'm not an attorney, consider the evidence hearing's rules (Frye / Daubert). Both types of hearings have as a foundational element what is commonly known as the “general acceptance test.” Generally accepted scientific methods are admissible, and those that are not sufficiently established are inadmissible.

Can a tool or technique without a history of publication or validation be "sufficiently established?"

Camera Match Overlay technology is new. It's the "shiny new object" for reconstruction exercises. The resulting videos become an amazing demonstrative aid to one's testimony, using the power of stunning visuals to illustrate one's theory of a case. But, bear in mind that it's only a demonstrative illustration of a single theory. There may be other theories worthy of exploration. If you're engaged in science, Daubert requires that you explore those other theories. If you're just engaged in trial support, and thus have no opinion, then go right ahead and create those stunning visuals.

All of this requires a bit of honesty. When I've simply retrieved files, I'm engaged in technician level work. When I've clarified and enlarged a frame, I've engaged in technician level work. These activities can support an analysis, and thus help to illustrate one's opinion, but they're not "analysis" in and of themselves. From the Frye ruling, "while courts will go a long way in admitting expert testimony deduced from a well-recognized scientific principle of discovery, the thing from which the deduction is made must be sufficiently established to have gained general acceptance in the field in which it belongs." When I want to offer an opinion, I must use tools and techniques that have been sufficiently established in my field. If I want to use "reconstruction" tools to reinforce my opinion in an "identification" exam, those tools must be sufficiently established within the realm of "identification." At this time, there are no studies validating the use of the Camera Match Overlay technology and methods for "identification" or "classification."

There are no studies involving the product at all. It's brand new. I'm certainly open to participating in validation studies, if anyone wants to engage our services. But for now, Camera Match Overlay seems to belong to the world of reconstruction until validated otherwise.

Thanks for reading. Have a great day my friends.

Sunday, August 25, 2019

Camera Match Overlay?

Following up on yesterday's post, we were dealing with an interesting issue that concerns the mixed methods approach of inserting a single frame taken from CCTV footage (previous event) into a laser-generated point cloud scan (current event) processed by [ fill in the blank ] software. Let's continue ...

One of my favourite scenes from the 1999 hit movie, The Matrix, gives us a clue to how this conundrum will be addressed in the near term.

Spoon boy: Do not try and bend the spoon. That's impossible. Instead... only try to realize the truth.

Neo: What truth?

Spoon boy: There is no spoon.

Neo: There is no spoon?

Spoon boy: Then you'll see, that it is not the spoon that bends, it is only yourself.

Over in southeast Washington, the folks behind iNPUT-ACE have come up with a novel approach, essentially telling us that there is no spoon to bend.

I've been curious about the Camera Match Overlay feature in ACE for a while now. I was aware of what folks were doing with point clouds and have seen some of the demonstratives in YouTube videos of Grant and Andrew's testimonial experiences. As usual, the Fredericks family has been quite open with what they're on about. It's good for business after all.

So why does the new overlay tool remind me of the spoon?

The Camera Match Overlay does not bring a point cloud into ACE. The Camera Match Overlay does not integrate with a scanner's software to insert CCTV frames or video. What the Camera Match Overlay does is quite novel, and entirely sidesteps the issue of a mixed-methods approach.

You see, when using the Camera Match Overlay tool in iNPUT-ACE, you're working in ACE to "toggle the opacity" of ACE (where the CCTV footage exists) whilst you reposition ACE's UI, and thus the footage, as an overlay to your preferred scanning tool (where the point cloud exists). This is a manual process that requires that you match the "zoom level" of the scene in your scanning tool to that of the CCTV frame(s), as well as manually positioning the ACE overlay. But, once "eyeballed," you can lock in this positioning in ACE's UI. As a technical aside, I would imagine that your GPU and monitor's performance will be critical, as well as your eyesight, in properly "eyeballing" the alignment.

Thus, there is no spoon. ACE serves up the CCTV footage (projects) with a faded opacity so that you can work in your 3D tool. I know that the Fredericks use the legacy tool, SceneVision-3D, to work with point cloud data. But, I'm sure that the overlay will work with any vendor's tools once properly set up.

Now, on the science side, it seems that ACE can help you make some calculations of error with the assistance of your scanning tool, producing a report to address margin of error. And ... as you should know, margin of error is directly related to your region of interest. The farther away from the camera your object of interest is, the larger the error will be ... as well as your nominal resolution. It seems that ACE reports the error in a few ways - with the final number being a function of "scanner accuracy," "calibration accuracy," and "resolution accuracy." The scanner accuracy comes from your scanner. The calibration accuracy seems to come from the overlay (not sure how it handles potential mis-alignment). The resolution accuracy is really just nominal resolution for the region of interest.

For close up work, it seems that error won't be much of a problem unless, for example, your subject is within the normal distribution for human heights. But at longer distances, the error can be in the ~2'-4' range; not much if you're trying to measure a skid mark but too big if you're trying to measure a person's height. Back to that manual process of aligning views, with a high nominal resolution, being off by just a few pixels will significantly affect your error - so again, pay close attention to that step in your process.
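To put rough numbers on the distance effect, here is a minimal Python sketch. It assumes an ideal pinhole camera with no lens distortion, and the field of view, image width, and misalignment values are hypothetical. The root-sum-of-squares combination is a common convention for independent error sources that I've assumed for illustration; it is not a description of how ACE computes its reported figure.

```python
import math

def nominal_resolution(distance_m: float, hfov_deg: float, width_px: int) -> float:
    """Approximate scene width covered by one pixel (metres per pixel).

    Assumes an ideal pinhole camera and an object near the optical axis.
    """
    scene_width_m = 2.0 * distance_m * math.tan(math.radians(hfov_deg) / 2.0)
    return scene_width_m / width_px

def combined_error(*components_m: float) -> float:
    """Root-sum-of-squares of independent error components (an assumed convention)."""
    return math.sqrt(sum(c * c for c in components_m))

# Hypothetical camera: 90 degree horizontal field of view, 704 pixels wide.
for distance in (3.0, 30.0):
    res = nominal_resolution(distance, 90.0, 704)
    # Effect of "eyeballing" the overlay 3 pixels off at this distance.
    misalignment_m = 3 * res
    print(f"{distance:>5.1f} m: {res * 100:.2f} cm/pixel, "
          f"3 px misalignment = {misalignment_m * 100:.2f} cm")
```

Under these assumptions the size of a pixel in the scene grows in direct proportion to distance, so the same few pixels of misalignment cost ten times as much at 30 m as at 3 m.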

One problem you'll have to overcome is that most scanning software lacks any sort of video output to save out your pretty demonstratives. You won't be able to export your demonstratives out of ACE either - remember, it's just overlaying the UI. But, if you've got an Omnivore, or Camtasia, you've got all that you need. Remember, these are demonstratives. You're working in the world of the abductive - taking your best shot to demonstrate your theory of the case in which you are working.

Now for the disclaimers.

Don't try this without training and experience in every single piece of your puzzle. Given that these demonstratives will be used to illustrate your opinion, please get certified as an analyst. Make sure that you understand the foundation science and rules of this discipline. Get to ACE's training. Make sure that training includes an appropriate amount of time on the Camera Match Overlay tool. Make sure that the tool works with your chosen scanner software. Make sure you're trained and experienced on that tool as well. Make sure that you have training in Forensic Photographic Comparison as well as Forensic Photogrammetry. If you're using Camtasia, get trained and certified there as well (I did). If you're using an Omnivore, get trained. Why? Because in the world of demonstratives, you're demonstrating your theories, opinions, and conclusions. It's your opinion, thus it becomes all about you and the foundations of your work.

Have a good day, my friends.

Saturday, August 24, 2019

Using FARO Scene 2019 to measure within CCTV images?

Last month, FARO updated their Scene software's user guide for 2019. I was very curious to see if the guidance had changed on a controversial issue that I addressed in a letter to the editor of the Journal of Forensic Identification (link) last year.

The heart of the issue deals with the mixed methods approach of inserting a single frame taken from CCTV footage (previous event) into a laser-generated point cloud scan (current event) processed by FARO's Scene software. The question on everyone's mind is this: can you measure items / objects / subjects that are present in the CCTV image but are not present in the point cloud scan?

Let's see what the new user guide has to say.

page 80 - 08m86e00_FARO_SCENE_User_Manual_2019.1.pdf

According to FARO (see above graphic), "There are three ways to use images in SCENE:

Images can be added with their original resolution to the workspace and thus provide additional information about the scan environment.
Images can be added with their original resolution to the workspace and thus provide additional information about the scan environment. These images are imported into the 3D world into virtual scans with their full resolution. Such images will be interpreted like a high resolution scan of a plane surface and can be placed on arbitrary positions in the 3D world.
Images can be used to add color information to already existing scan points."

What is the primary reason for adding images into the scan environment? Images can be used to add "richness" and "texture" to the scan. Here's how:

page 81 of the pdf file

If you've worked with these types of scans, you know that the result is a flat grey colour. The role of images, or virtual scans, is to add information back into the scene. In the example above, the scan only registers the location of the picture frame - but not the contents of the frame. Thus, the virtual scan is utilized to add the "information" about the picture into the scan.

But that's not what you want to do, is it? You want to measure an item that isn't present in the scan.

page 85 of the pdf file

The Place in 3D function seems like it might be it. But, it's not. The Place in 3D function allows the user to place a 2D representation of a region of interest into the scan. The user then must associate points in common between the 2D image and the 3D scan.

page 86 of the pdf file

Now that you've registered the scan and associated points in common, you're ready to measure ... or are you?

You don't want to measure items in the scan. You want to measure items not in the scan. Unfortunately, you can't with this tool. There are no corresponding points between the thing you want to measure and the point cloud. The FARO Scene manual lists the procedure for measuring points in common. It does not show users how to measure the woman in the scene above, who is not present in the point cloud scan.

Thus, the question that begs asking: if a "forensic video analyst" has a scan, some CCTV footage, and FARO Scene, where is the measurement happening? If it's happening in Scene, which doesn't support the function, how accurate are the measurements? If it's not happening in FARO Scene, where is it happening?

If the measurement is not happening in Scene, how does the "analyst" get the image out?

Section 11 deals with exporting from Scene. "11.5.3 Exporting the images of the scans to .jpg format

In the Structure View, right-click the scan, then select Export> PanoramicImages. Select Scan Resolution to create images that have the same color resolution as the scan, or select Full Color Resolution if you want to create panoramic images with the highest color quality possible, and which are compensated to remove the offset between the two halves of the scan, as well as any distortion cause by the scanner’s rotation. Full color resolution panoramas have a white stripe at the bottom of the picture because the proportions of the scan and the picture are different. (Scans made with FARO scanners versions M70, S70, S350 and later create 160 megapixel images. Scans from older scanners only output panoramic images with 40 megapixel images.)"

Which requires another question be asked: what's the point of bringing the CCTV frame into the point cloud scan only to bring it out again to measure?

Why is this a big issue? Science. The mixed-methods approach has yet to be validated in a general sense, with that validation published in order for "the community" to investigate the methodology and results and attempt to replicate the published experiment.

Has there ever been an evidence hearing (Frye / Daubert hearing) on this? If you're aware of one, I'd sure like to know. Not that evidence hearings should replace validation studies, or that evidence hearings should come before validation studies. I'm just aware, anecdotally, that people are performing this technique and testifying as to their results / conclusions. What isn't available in the anecdotes is any sense of the science or testimony as to validity.

Let's take a look at why this is important.

You'll remember that prior to Daubert, Frye was the law of the land. The Frye standard is commonly referred to as the “general acceptance test” under which generally accepted scientific methods are admissible, and those that are not sufficiently established are inadmissible. Can something without a history of publication or validation be "sufficiently established?"

The Frye Standard comes from the case Frye v. United States, 293 F. 1013 (D.C. Cir. 1923) in which the defendant, who had been charged with second degree murder, sought to introduce testimony from the scientist who conducted a lie detector test. The D.C. Court of Appeals weighed expert testimony regarding the reliability of lie detector test results. The court noted: Just when a scientific principle of discovery crosses the line between the experimental and demonstrable stages is difficult to define…. [W]hile courts will go a long way in admitting expert testimony deduced from a well-recognized scientific principle of discovery, the thing from which the deduction is made must be sufficiently established to have gained general acceptance in the field in which it belongs.

The last part of that sentence is where I want to go with Frye - in the field in which it belongs.

3D laser scans of traffic collisions are wonderful. 3D laser scans of scenes where the police have used force are a good thing. These recording methods capture the smallest details of the scene - present at the time of the scan (aka "now"). They have done so well in their documentation of scenes that they've become generally accepted among "scene reconstructionists." But, "scene reconstruction" is an entirely different function than digital / multimedia forensic analysis. For the FVA community, other methods of photogrammetry are generally accepted (e.g. single image photogrammetry). If the "scene reconstruction" folks want to work in the domain of digital / multimedia forensic analysis, then they must follow its rules ... not theirs. This is key. Remember, validity deals with the accuracy of the measure as well as the appropriateness of the process / tool.

I was initially excited to see that FARO had issued an update. But, as you can see, not much has changed from the previous version in terms of measuring objects that aren't in the scan. Perhaps next time ...

Thursday, August 22, 2019

Fusion-based approach to digital and multimedia forensics

No tool is perfect. The problem for digital / multimedia forensic analysts has always been one of "does Tool X support the types of files that I see in my lab." Tool manufacturers do their best to support their customers, but there are just so darned many file types to keep track of.

Early on, we saw this in Los Angeles with mobile phones. California, being a CDMA state, wasn't well supported by the mobile forensics tools. We found a Korean company with an office in Los Angeles that made an amazing parsing tool that could find things in the physical that no one else could. I still rely upon FinalMobile and the staff at FinalData to help me with the processing and analysis of the most difficult files.

The same problem exists in the processing of evidence from DVRs. In Los Angeles, we were seeing DVRs from China that weren't being sold elsewhere in the US. Thus, for the tool makers that were building acquisition templates based upon what was around their developers, their tools never seemed to work well on the DVRs that I encountered in LA. I found a Chinese company that had SDK access to the Chinese DVR manufacturers ... and thus a more comprehensive support for the types of DVRs that I was seeing in LA. They're not one-at-a-timing DVRs. They work directly and cooperatively with the manufacturers to assure that they support the DVRs that are produced in China.

In terms of image / video processing, as innovation from Italy winds down and their prices in the US and Canada increase (in some cases quite dramatically), some really cool developments are happening in south-eastern Washington state. Do I believe that you should only have FIVE, or only have Input-Ace? Certainly not. There are things that each does really well, and things that each does poorly or not at all. If you can afford to, get both. If you can't afford both, look at the type of work that you're doing and see which is the most appropriate for you.

I've always preached a "fusion-based" approach to digital / multimedia forensic analysis - otherwise known as "buy one of each tool" if you can. In doing so, you'll have the greatest coverage possible for the evidence that arrives at your lab. With DVRs, there are things to like about SalvationData's Video Investigator Portable (that Chinese company mentioned above) and there are things to like about DVR Examiner. SalvationData's VIP can acquire NTFS discs - like those from Exacq Vision (we see those a lot in California). It can also perform the acquisition over the network (even over WiFi) for those cases where seizing the DVR isn't practical or legal (as is the case in California with the new digital privacy laws). It can find file fragments and organize them logically. On the other hand, not everyone can purchase software direct from China. DVR Examiner comes from Colorado - where there are people to pick up the phone when you're working (not in the central Asian time zone). It's convenient. It's also available as a package deal with Input-Ace and the whole universe of tools and services from Cellebrite.

Gone are the days of doing everything in Photoshop. A fusion-based approach just makes sense in today's world. With this in mind, you'll see more product reviews and deep dives on fusion-based workflows around some complex cases in future posts as well as in our on-line learning portal. Stay tuned. It's going to be fun.

Wednesday, August 21, 2019

Experimental Design

Before one begins any sort of research, one usually surveys the literature on the topic to see if any research has been completed and what, if anything, was concluded. Sure, the researcher has a general idea about what they want to study, but a literature review helps to inform and refine the eventual design of the study. According to Shields and Rangarajan (2013), there's a difference between the process of reviewing the literature and a finished work or product known as a literature review. The process of reviewing the literature is often ongoing and informs many aspects of the empirical research project. See what I just did? I discovered some research on literature reviews, and inserted the summary into my paragraph. Usually, there's an accompanying citation. Here it is: Shields, P., Rangarajan, N. (2013). A Playbook for Research Methods: Integrating Conceptual Frameworks and Project Management. Stillwater, Oklahoma: New Forums Press. ISBN 1-58107-247-3.

I received some feedback about the few posts I've written regarding "headlight spread pattern analysis." One was very intriguing - "... assume the premise is true, that there is uniqueness that can be discovered through experimentation. Where would you begin? What would the experimental design look like?" Hmmm....

Given the term, "headlight spread pattern analysis," there are four distinct elements - "headlamps," "the diffusion of light," "pattern matching," and a methodology for "analysis." Each of these would need to be handled separately before adding the next element - the recording of this diffusion of light within a scene.

Let's just do a bit of research on the types of headlamps available to the general commercial market, leaving the other three elements for later.

Our first discovery is that we must separate the "lamp" from the "bulb." The bulb provides the "light" and the "lamp" is a system for the projection of that light.

For the lamp's housing, there are two general types: "reflector" and "projector." Of the light sources ("bulb"), there are several available types: Tungsten, Tungsten-halogen, High-intensity discharge (HID), LED, Laser. Of the "filament" type lamps, there are over 35 types available for sale in the US. These are covered by the World Forum for Harmonization of Vehicle Regulations (ECE Regulations), which develops and maintains international-consensus regulations on light sources acceptable for use in lamps on vehicles and trailers type-approved for use in countries that recognise the UN Regulations.

Given the eventual experimental design, it's important to note that the US and Canada "self-certify" compliance with the ECE Regulations. No prior verification is required by a governmental agency or authorised testing entity before the equipment can be imported, sold, installed, or used.

For a bulb's operation, there are variables to consider. There's voltage (usually 12V) and wattage (between 20W and 75W) - collectively known as "Nominal Power." Then there's "luminous flux." In photometry, luminous flux or luminous power is the measure of the perceived power of light. It differs from radiant flux, the measure of the total power of electromagnetic radiation (including infrared, ultraviolet, and visible light), in that luminous flux is adjusted to reflect the varying sensitivity of the human eye to different wavelengths of light.

Lots of big words there. But two stand out - luminous flux (the measure of the perceived power of light) and radiant flux (the measure of the total power of electromagnetic radiation). For our experiment, we'll need to differentiate between these as someone / some people are going to compare patterns (perception is subjective). We'll also need an objective measure of the total power of our samples. Luminous flux is used in the standard as the point of headlamps is to improve the driver's perception of the scene in front of them as they drive.

Luminous flux is measured in lumens. On our list of bulbs, the luminous flux values are reported as being between 800lm and 1750lm with a tolerance of between +/-10% and +/-15%. Applying the widest tolerance at both ends makes the range between 680lm and 2012.5lm. It's important to remember that the performance of a bulb over its life span is not binary (e.g. 1550lm constantly until it stops working). Performance of lamps degrades over time.
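The range quoted above can be reproduced with a few lines of arithmetic. This is only a sketch. It assumes the widest tolerance (15%) applies at both ends of the nominal range, which is how the 680lm and 2012.5lm figures work out. The names are mine, not from any standard.

```python
# Nominal luminous flux range from the bulb list, in lumens.
NOMINAL_MIN_LM = 800.0
NOMINAL_MAX_LM = 1750.0
# Assumption: the widest stated tolerance (+/-15%) applies at both ends.
WIDEST_TOLERANCE = 0.15


def flux_range(nominal_min, nominal_max, tolerance):
    """Return the (lowest, highest) flux once the tolerance is applied."""
    lowest = nominal_min * (1.0 - tolerance)
    highest = nominal_max * (1.0 + tolerance)
    return lowest, highest


lowest, highest = flux_range(NOMINAL_MIN_LM, NOMINAL_MAX_LM, WIDEST_TOLERANCE)
print(f"{lowest:.1f} lm to {highest:.1f} lm")  # prints: 680.0 lm to 2012.5 lm
```

Note that this is the range for new bulbs only. Degradation over the bulb's life would widen it further at the low end.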

Back to the lamp as a system. There are the general types of fixed lamps - they're bolted on to the front of the vehicle by at least four fasteners. These types need to be "aimed" at the time of installation, and that aim can shift over time as the fasteners loosen. There are also "automatic" lamps, which feature some form of "beam aim control." These "beam aim control" types include headlamp leveling systems, directional headlamps, advanced front-lighting system (AFS), automatic beam switching, Intelligent Light System, adaptive highbeam, and glare-free high beam and pixel light.

Now in the cases that I've reviewed, it seems that "headlight spread pattern analysis" was employed when a proper vehicle make / model determination failed due to a lack of available detail - usually due to a low nominal resolution.

Given what I've just shared above, about the potential variables in our study, an important revelation emerges. If there is insufficient nominal resolution to conduct a vehicle make / model determination, which considers class characteristics like presence / quantity of features like doors and windows before considering the presence / quantity / type of features within those items, then how could there be a determination as to type of lamp system and bulb that would be necessary for any comparison of headlight spread pattern? What if there's a general match of the shape of the pattern, but the quality of light is wrong? Or, what if the recorder's recording process and compression scheme corrupt the shape of the light dispersal or change the quality of the light? How then is a "scientific" comparison possible?

Short answer - it's not. This is one of the ways in which "forensics" (rhetoric) is used to mask a lack of science in "forensic science."

But, let's take a look at that question in a different way. Given all of the variables listed above, what would a normal distribution of "headlight spread patterns" "look like" (observed without recording) for each of the possible combinations of system, bulb, and mounted position? What would they look like after being recorded on a DVR? This adds more variables to the equation.

For the recording system, there's the camera / lens combination, there's the transmission method, and there's the recorder's frame rate and compression scheme to contend with. Sure, you have the evidence item. But you don't know if the system was operating "normally" during that recording, or what "normal" even is until you produce a performance model of the system's operation in the recording of everything listed above. You'll need the "ground truth" of the recorder's capabilities in order to perform a proper experiment.

Remember, the recording may be "known" - meaning you retrieved it from the system and controlled its custody such that the integrity of the file is not in question. But, what is unknown is the make / model of the vehicle. THIS CAN'T BE PRESUPPOSED. IT MUST BE DETERMINED.

In the cases that I reviewed, each comparison was performed against a presupposed make / model of vehicle - the "suspect's vehicle." If convenience samples were employed for a "comparison," then it was a few handy cars of the same make / model / year as the accused's vehicle. THIS IS NOT A VEHICLE DETERMINATION. This is no different than a single-person line-up, or a "show-up." This method has no relationship with science.

Back to the literature review and how it may inform a future experimental design.

What I've discovered is that the quantity of variables is quite large. Actually, the quantity of system types, then the variables within those systems, is quite large. This is before considering how these will be recorded by a given recorder. This information would be required to validate case work, aka a CASE STUDY. A case study is only applicable to that one case. If one wanted to validate the technique, then an appropriate number of recorders would need to be included (proper samples of complete system types).

Given all of this, the cost of a single case study would be beyond the budget of most investigative agencies. It's certainly beyond my budget. The cost of testing the questions, "headlight spread pattern analysis has no validity" (H null) and "headlight spread pattern analysis has validity" (H1) would be massive.

Nevertheless, given all of the above, to conclude "match" - it is "the accused's vehicle" - one must rule out all other potential vehicles. Given that estimates put the number of cars and trucks in the United States at between 250-260 million vehicles for a country with 318 million people, then "match" says "to the exclusion of between 250-260 million vehicles" - which doesn't include the random Canadian or Mexican who drove their car / truck across the border to go shopping at Target. Because of this, "analysts" usually equivocate and use terms like "consistent with" or "can't include / exclude." Which, again, is rhetoric - not science.

Tuesday, August 20, 2019

Authentication education - now available on-line

I've been teaching authentication for many years. I've been all over the US and Canada presenting in classes big and small. Now, I've taken the big leap. My comprehensive educational course, Introduction to Forensic Multimedia Authentication, is available on-line as micro-learning.

This course lays the foundation for your work with your preferred tool. Current research, best practices, standards, and work flows are covered for audio, images, video, and the meta-data that are found in evidence files. If you'd like to see what is covered, here's the syllabus. This offering is the same as the in-person 40 hour course, which you can now do on-line. Because it's on-line, I can offer it to you for a much lower price than an in-person offering. Plus, you can take up to 60 days to complete it.

This course moves beyond the buttons of your preferred tool to lay the foundations for the work that you do. As such, it will assist you in explaining your work in your reports and in your testimony.

Click on the course link above. Check out the syllabus. Sign up today.

Monday, August 19, 2019

Welcome

Welcome to the Forensic Multimedia Analysis blog (formerly the Forensic Photoshop blog).

With the latest developments in the analysis of multimedia (video, audio, images, and metadata), we move the discussion beyond a single piece of software to include (in no particular order) processing and analysis fundamentals, court cases, upcoming training offerings, product reviews, current research, standards and practices, industry events and trends, and much more.

Digital / multimedia forensic analysis covers the domains of:

Authentication
Photogrammetry
Photographic Comparison
Photographic Content Analysis

Clarification, enhancement, and restoration are processes that can occur within the domains, but aren't domains in and of themselves.

We use the term digital / multimedia forensic analysis, as opposed to forensic video analysis or forensic audio analysis as we acknowledge that modern multimedia evidence potentially contains audio, images, video, as well as metadata. Thus, we need to be able to process and analyze everything that's found in the evidence files that we receive.

It's also important to define "forensic science." For this, I'll refer to "A Framework to Harmonize Forensic Science Practices and Digital/Multimedia Evidence." OSAC Task Group on Digital/Multimedia Science. 2017 (link): "Forensic science is the systematic and coherent study of traces to address questions of authentication, identification, classification, reconstruction, and evaluation for a legal context."

What is a trace? "A trace is any modification, subsequently observable, resulting from an event." You walk within the view of a CCTV system, you leave a trace of your presence within that system. You send a text, you leave a trace on your phone.

Within this framework, and wherever possible, we'll frame our discussion around standards and science. Validity, reliability, and reproducibility will be our goals ... not to present something unique and proprietary that only we can do here, but to illustrate the science behind the tools and techniques so that you can do it too.

I hope you enjoy your time here.

Jaime

Sunday, August 18, 2019

First, do no harm

In an interesting article over at The Guardian, Hannah Fry, an associate professor in the mathematics of cities at University College London, noted that mathematicians, computer engineers, and scientists in related fields should take a Hippocratic oath to protect the public from powerful new technologies under development in laboratories and tech firms. She went on to say that "the ethical pledge would commit scientists to think deeply about the possible applications of their work and compel them to pursue only those that, at the least, do no harm to society."

I couldn't agree more. I would add forensic analysts to the list of people who should take that oath.

I look at the state of the digital / multimedia analysis industry and see places where this "do no harm" pledge would re-orient the relationship that practitioners have with science.

Yes, as someone who swore an oath to protect and defend the Constitution of the United States (as well as the State of California), and as someone who had Bill Bratton's "Constitutional Policing" beaten into him (not literally, people), I understand fully the relationship between the State and the Citizen. In the justice system, it is for the prosecution to offer evidence (proof) of their assertions. This simple premise - innocent until proven guilty - separates the US from many "first world" countries.

I've been watching several trials around the country and noticed an alarming trend - junk procedures. Yes, junk procedures and not junk science as there seems to be no science to their procedures - which serve as a pretty frame for their lofty rhetoric. This trend can be beaten back, if the sides agree to stick to the rules and do no harm.

Realizing that I've spent a majority of my career as an analyst in California, and that California is a Frye state, I'll start there in explaining how we, as an industry, can avoid junk status and reform ourselves. Let's take a look.

You might remember that prior to Daubert, Frye was the law of the land. The Frye standard is commonly referred to as the “general acceptance test” under which generally accepted scientific methods are admissible, and those that are not sufficiently established are inadmissible.

The D.C. Court of Appeals weighed expert testimony regarding the reliability of lie detector test results. The court noted: "Just when a scientific principle or discovery crosses the line between the experimental and demonstrable stages is difficult to define…. [W]hile courts will go a long way in admitting expert testimony deduced from a well-recognized scientific principle or discovery, the thing from which the deduction is made must be sufficiently established to have gained general acceptance in the field in which it belongs."

The last part of that sentence is where I want to go with Frye - "in the field in which it belongs."

There is an emerging trend, highlighted in the Netflix series Exhibit A, where [ fill in the type of unrelated technician ] is venturing into digital / multimedia analysis and working cases. They're not using the generally accepted methods within the digital / multimedia analysis community. They're not following ASTM standards / guidelines. They're not following SWGDE's best practices. They're doing the work from their own point of view, using the tools and techniques common to their discipline. Often times, their discipline is not scientific at all, and thus there is no research or validation history on their methods. They're doing what they do, using the tools they know, but in a field where it doesn't belong. Their tools and techniques may be fine in their discipline - but there has been no research on their use in our discipline. Thus, before they engage in our discipline, they should validate them appropriately - in order to do no harm.

Let's look at this not from the standpoint of my opinion on the matter. Let's look at this from the five-part Daubert test.

1. Whether the theory or technique in question can be and has been tested. Has the use of [ pick the method ] been tested? Remember, a case study is not sufficient testing of a methodology according to Daubert.

2. Whether it has been subjected to peer review and publication. There are so few of us publishing papers, and so few places to publish, that this is a big problem in our industry. Combine that with the fact that most publications are behind paywalls, making research on a topic very expensive.

3. Its known or potential error rate. If there is no study, there really can't be a known error rate.

4. The existence and maintenance of standards controlling its operation. If it's a brand new trend, then there really hasn't been time for the standards bodies to catch up.

5. Whether it has attracted widespread acceptance within a relevant scientific community. The key word for me is not "community" but "scientific." There are many "communities" in this industry that aren't at all "scientific." Membership organizations in our discipline focus on rapidly sharing information amongst members, not advancing the cause of science.

So pick the emerging trend. Pick "Headlight Spread Pattern." Pick "Laser Scan Enabled Reverse Projection." Jump into any research portal - EBSCO, ProQuest, or even Google Scholar. Type in the method being offered. See the results ...

The problem expands when someone finds an article, like the one I critiqued here, that seemingly supports what they want to do, whilst ignoring the article's limitations section or the other articles that may refute the assertions. This speaks to the need for a "research methods" requirement in analysts' certification programs.

If you're venturing into novel space, did you validate your tool set? Do you know how? Would you like training? We can help. But, remembering that people's lives, liberty, and property are at stake (and that they're innocent until proven guilty), can we at least agree to begin our inquiries from the standpoint of "first, do no harm"?

Tuesday, August 13, 2019

Four Kinds of Science

As a non-verbal autistic person, "language" has always been an issue for me. If you've seen me teaching / talking during a class or a workshop, this "version" of me is akin to Jim 4.0. I haven't always been extemporaneously vocal. For me, this skill was added in my late 20's and early 30's.

Because of my "issues" with language, I've chosen to participate in the standards setting bodies and assist in the creation of clearly worded documents. Words mean something. Some words mean multiple things - yes English is a crazy language. I've tried to suggest words with single meanings, eliminating uncertainty and ambiguity in our documents.

To summarize der Kiureghian and Ditlevsen (2009), although there is no unanimously approved interpretation of the concept of uncertainty, in a computational or a real-world situation, uncertainty can be described as a state of having incomplete, imperfect and/or inconsistent knowledge about an event, process or system. The type of uncertainty in our documents that is of particular concern to me is epistemic uncertainty. The word ‘epistemic’ comes from the Greek ‘επιστήμη’ (episteme), which means 'knowledge.' Epistemic uncertainty is the uncertainty that is presumed as being caused by lack of knowledge or data. Also known as systematic uncertainty, epistemic uncertainty is due to things one could know in principle but doesn't in practice. This may be because a measurement is not accurate, or because the model neglects certain effects, or because particular data has been deliberately hidden.

With epistemic uncertainty in mind, I want to revisit the concept of "science" - as in "forensic science." The problem for me, for my autistic brain's processing of people's use of the term "forensic science" is that I believe that by their practice (in their work) they're emphasizing the "forensic" part (as in forum - debate, discussion, rhetoric) and de-emphasizing the "science" part. Indeed, do we even know what the word "science" means in this context?

According to Mayper and Pula, in their workshops on the epistemology of science as a human issue, there are four kinds of science:

Accepted Science - theories that are not yet refuted, after rigorous tests. Counter-examples must be accounted for or shown to be in error. Theories “tentative for ever”, but not discarded frivolously. Good replacements are not easily come by. A new theory must account for not only the data that the old theory doesn’t, but also all the old data that the old theory does.
Erroneous Science - theories that are not yet refuted, but are tested by false data:

Fake Science — scientist intentionally deceives others
Mistaken Science — scientist unintentionally deceives self (and others)

Pseudoscience - theories inconsistent with accepted science, attempts to refute them avoided or ignored, e.g. astrology, numerology, biorhythms, dowsing.
Fringe Science - theories inconsistent with accepted science, not yet refuted, but attempts to do so invited, e.g. Unified field theories (data accumulate faster than theory construction), Rupert Sheldrake’s “morphogenetic fields”, Schmidt’s ESP findings, etc.

I think a few of the popular practices in the "forensic sciences," like "headlight spread pattern analysis" and the merging of laser scans and CCTV images to measure items present in the CCTV footage that aren't present in the laser scan, currently qualify as pseudoscience. Why? They haven't been validated. Questions about validation often are sidestepped and users focus on specific legal cases where the technique was employed and subsequently allowed in as demonstrative evidence. The way in which these two techniques are employed is inconsistent with accepted science - math, stats, logic ...

Indeed, to qualify as accepted science, these new techniques must "account for not only the data that the old theory doesn’t, but also all the old data that the old theory does." But, in doing so, must they also account for the existing "rules"? For example, with Single Image Photogrammetry, the error potential (range) increases as the distance from the camera to the subject / object increases. The farther away the thing / person is from the camera, the greater the error or range of values (e.g. the subject is between 5'10" - 6'2"). Also, Single Image Photogrammetry needs a reference point in the same orientation as the subject / object. Additionally, it needs that reference to be close to the thing measured. As distance from the reference increases, the error potential (range) increases.
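To make the distance point concrete, here's a minimal sketch using a simple pinhole camera model. It's my illustration, not a validated photogrammetry method. The focal length and pixel pitch are hypothetical values, and real CCTV footage adds lens distortion, scaling, and compression on top of this.

```python
def pixel_footprint_mm(distance_m, focal_length_mm, pixel_pitch_um):
    """Width, in mm, that one pixel covers at the given distance (pinhole model)."""
    distance_mm = distance_m * 1000.0
    pixel_pitch_mm = pixel_pitch_um / 1000.0
    return distance_mm * pixel_pitch_mm / focal_length_mm


# Hypothetical camera values, chosen only for illustration.
FOCAL_LENGTH_MM = 4.0
PIXEL_PITCH_UM = 6.0

for distance_m in (3, 6, 12, 24):
    footprint = pixel_footprint_mm(distance_m, FOCAL_LENGTH_MM, PIXEL_PITCH_UM)
    # A one-pixel error at each end of a measurement spans two footprints.
    print(f"{distance_m:>2} m: one pixel covers about {footprint:.1f} mm, "
          f"a 1 px error at each end is about {2 * footprint:.1f} mm")
```

Under this model the footprint of a pixel grows in direct proportion to distance, so the same one-pixel placement error turns into a wider range of values the farther away the subject stands.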

Thus, for "headlight spread pattern" analysis, what is the nominal resolution of the "pattern?" If the vehicle is in motion, how is motion blur mitigated? Given all of the variables involved and the nominal resolution within the target area (which is also variable due to the perspective effect), the "pattern" would rightly become a "range of values." If it's a "range of values," how can results derived from convenience samples be declared to "match" or be "consistent with" some observed phenomenon? Analysts are employing this technique in their work, but no validation exists - no studies, no peer-reviewed published papers, nothing. Wouldn't part of validating the method mean using the old rules - analytical trigonometry - to check one's work?

The same situation exists for the demonstrative aid produced by the mixed-methods approach of blending CCTV stills or videos (a capture of then) with 3D point clouds (a capture of now). The new approach must account for all the old data. To date, the few limited studies in this area have used ideal situations and convenience samples. None have used an appropriate sample of crappy, low priced DVRs in their studies. For example, Meline and Bruehs used a single DVR and actually tested the performance of their colleagues (link) not the theory that the measurement technique is valid. In their paper, they reference a study that utilized a tripod-mounted camera deployed in an apartment's living room to "test" the author's theory about accurate measurement. The author employed a convenience sample of about 10 friends and the distance from the camera to the subjects was about 10' - it was done in a living room in someone's apartment ffs.

I don't want to make the claim that the purveyors of these techniques are engaged in "fake science." I tend to think well of others. I think perhaps they're engaged in "mistaken science," unintentionally deceiving themselves and others.

We can reform the situation. We can insist that our agencies and employers make room in their budgets for research and validation studies. We must publish the results - positive or negative. We must test our assertions. In essence, we must insist that the work that we perform is actually "science" and not "forensics" (rhetoric).

If you'd like to join me in this effort, I'd love to build a community of advocates for "accepted science." Feel free to contact me for more information.

Thursday, August 8, 2019

Customer Service?

It's no mystery that the company for which I worked after retiring from police service no longer exists. The former Nevada corporation known as Amped Software, Inc., is no more. My tenure ended as abruptly as the company at the end of February, 2019.

In spite of this, I still get calls daily from customers of that former company - customers who bought software, services, and support from that Nevada corporation - doing a little internet research to track down someone who will answer the phone (benefits of an unpronounceable Scandinavian last name, I guess). I'm happy to take their calls. I'm still able to answer UI / UX questions regarding the whole spectrum of technology used in digital / multimedia forensics. I can still offer training and consulting. What I can't do is sell Amped SRL's products or services. Neither can I comment officially on the inner workings of Amped SRL.

Why bring this up?

It seems with the latest update to FIVE, Amped SRL made yet another change to its EULA. Like the previous modification, it's a rather significant alteration of the agreement between the user and the creator of the software. This is important to an agency which requires the EULA to go through its attorneys for vetting before software is purchased / installed (as is the case with many US-based agencies).

In essence, this agency's employee reached out to me for help / advice because he's facing discipline over having installed FIVE without first running the changed EULA by his agency's legal team for approval.

What he was hoping to verify was that there was no way he could have known in advance that from version to version, the EULA had changed.

My answer was a bit nuanced.

No, to the best of my knowledge, there is no place on the Amped website to find copies of the EULA.
No, to the best of my knowledge, Amped does not publicly announce (press release) a change in licensing terms.
No, to the best of my knowledge, Amped does not contact its user base to announce that an update contains a new EULA.
Yes, the end user is presented with the EULA when installing the software. One could read it and make decisions to install / not to install based upon what was read.
But, to the best of my knowledge, if the end user decides not to install the software, no copy of the EULA is installed on the computer. Thus, you have to install the software in order to have a copy of the EULA to share with others.

It's a kind of no-win situation for this particular end user. Yes, he installed the software. Yes, he had to in order to have a copy of the EULA. Yes, according to the terms in the EULA, he agreed to the EULA - binding him / his agency to its terms upon installation.

Why is this an issue?

The material change in the new EULA changes the venue for grievances, and thus the applicable laws, from Nevada to New York. This particular agency exists within a county that has chosen to boycott New York state and its business for political reasons. I can certainly sympathize. I remember when I was prohibited from traveling to several training opportunities in states with which California had a beef.

At this customer's agency, there was a contractual relationship between it and Amped Software, Inc., a Nevada corporation, to provide software and support. Now, the end user has agreed (by installing FIVE) to the ending of that relationship and an establishment of a new relationship with Amped Software USA, Inc., a New York corporation. The end user did this without permission of his chain of command.

Yes, this all sounds pedantic. It probably is. But, in civil service, everything's cool until it's not. As a Shop Steward, I've seen this type of activity many times. Someone has it out for you and is no longer willing to overlook trivialities like this. It may seem silly to you and me, but in this jurisdiction, doing business with a company in New York is anything but trivial.

Unfortunately for the end user, it gets worse.

FIVE is not "free software," that is sure. According to the GNU, "“Free software” means software that respects users' freedom and community. Roughly, it means that the users have the freedom to run, copy, distribute, study, change and improve the software. Thus, “free software” is a matter of liberty, not price. To understand the concept, you should think of “free” as in “free speech,” not as in “free beer”. We sometimes call it “libre software,” borrowing the French or Spanish word for “free” as in freedom, to show we do not mean the software is gratis." FIVE, under that definition, is not free.

Neither is FIVE "open source software." Here's how the Open Source Initiative defines "open source."

FIVE is commercial software, commercial software that wouldn't be nearly as functional were it not for free / open source software. It's built with wxWidgets and OpenCV, open source development tools. Its web support is enabled by free communication tools like Gmail.

That PDF report is made possible by wkhtmltopdf, which is an open source command line tool used to render HTML into PDF using the Qt WebKit rendering engine ... which explains why HTML is the default reporting format.
The visual part of playback happens with the assistance of MPlayer and MEncoder, a free movie player / encoder that's been around for almost 20 years and is available under the GNU GPL.
Decoding and playback of formats happens with the help of FFmpeg, FFMS2, and libde265 - doesn't everyone's these days?
The file information functions are supported by ExifTool and MediaInfo.

Why am I mentioning this? The end user found out, through an Internal Affairs inquiry, that these built-in tools all have their own license agreements that get installed (and thus agreed to) with the installation of FIVE. He submitted none of these license agreements to his purchasing department when requesting the software purchase. As a matter of course, the purchasing process asks the requestor for "all relevant license agreements" when requesting a software license purchase. Thus, in addition to the contracting issues, he's being accused of committing "lies by omission" for failing to disclose all relevant agreements - 11 counts in all, in addition to the issue with FIVE's EULA and his agreeing to it without permission. If you're curious, you can find the licenses in FIVE's folder on your hard drive.

The customer allowed me to share this with you under the condition of anonymity. If all of this is news to you, and you see any of this as a problem, please consult your attorney / employee representative for advice. I'm not an attorney and this isn't legal advice. It's just an FYI to inform you of an issue of which you might not be aware.

Be well, my friends.

Wednesday, August 7, 2019

Where'd you get the 10?

When the "Search for Images on the Web" functionality was introduced into Amped's Authenticate some time ago, I asked a simple question of the development team, "where'd you get the 10?"

Amped SRL prides itself on operationalizing peer-reviewed published papers in image science. I assumed, wrongly it seems, that there was some science behind the UI's default setting for how many pictures you'd like to find. There isn't.

What I found out, at the time, is that the "Stop After Finding At Least N Pictures. N is" default of 10 is set to 10 for no particular reason whatsoever. My own opinion is that it's set to 10 because the developers are engineers and 10 - a one and a zero - looks nice. The 10 has no foundation in science / statistics as a valid number for that field. If you accept the default as presented, you're creating a convenience sample set that will give you more chances of being wrong than being right. Here's why.

The basic question being tested in the developer's example, with help from the "Search Images From Same Camera Model" dialog, is "match / no match." Does the evidence item match a representative sample of images from the same make / model of camera? You would perform this check when your evidence item's JPEG QT is not found in your tool's internal database (this is a known limitation of all software that is dependent upon an internal database). Another way of framing "match / no match" is a Generic Binomial Test.

Here's what the sample size calculation looks like for a generic binomial test performed in the criminal justice context (as opposed to the university / research context).

Analysis:  A priori: Compute required sample size
Input:     Tail(s)                   = Two
           Proportion p2             = 0.8
           α err prob                = 0.01
           Power (1-β err prob)      = 0.99
           Proportion p1             = 0.5
Output:    Lower critical N          = 19.0000000
           Upper critical N          = 40.0000000
           Total sample size         = 59
           Actual power              = 0.9912792
           Actual α                  = 0.0086415

A two-tailed test has a better ability to limit Type I and Type II errors vs. a one-tailed test. α and β error probability are set as low as possible, one chance per 100. This protocol yields a recommended sample size of 59. Not 10.

Around 20 samples, you have more chances of being wrong than being right. At 10 samples, you have about 8 chances in 10 of being wrong.
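If you'd like to explore the binomial calculation above without the stats package, here's a rough sketch of an exact two-tailed binomial power calculation in plain Python. It is not the code behind the output shown above. It assumes α is split equally between the two tails, and the function names are mine, so its answers may not agree with that output to the last digit.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """Probability of k or fewer successes in n trials."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))


def critical_values(n, p0, alpha):
    """Reject the null when X <= lo or X >= hi (alpha/2 allowed per tail)."""
    lo, cum = -1, 0.0  # lo = -1 means there is no lower rejection region
    for k in range(n + 1):
        cum += binom_pmf(k, n, p0)
        if cum > alpha / 2:
            break
        lo = k
    hi, cum = n + 1, 0.0  # hi = n + 1 means there is no upper rejection region
    for k in range(n, -1, -1):
        cum += binom_pmf(k, n, p0)
        if cum > alpha / 2:
            break
        hi = k
    return lo, hi


def power(n, p0, p1, alpha):
    """Chance of rejecting the null (p0) when the true proportion is p1."""
    lo, hi = critical_values(n, p0, alpha)
    lower = binom_cdf(lo, n, p1) if lo >= 0 else 0.0
    upper = 1.0 - binom_cdf(hi - 1, n, p1) if hi <= n else 0.0
    return lower + upper


def required_n(p0, p1, alpha, target_power, max_n=500):
    """Smallest n (up to max_n) that reaches the target power, else None."""
    for n in range(1, max_n + 1):
        if power(n, p0, p1, alpha) >= target_power:
            return n
    return None


for n in (10, 20, 59):
    print(n, round(power(n, p0=0.5, p1=0.8, alpha=0.01), 4))
print("smallest n found:", required_n(0.5, 0.8, 0.01, 0.99))
```

Comparing the power at 10 with the power at 59 shows how much is given up by accepting the default.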

The "match / no match" scenario is quite different from an attempt to establish "ground truth," or what the camera "should be producing" when it creates an image. For these tests, a "performance model" is required, and it is generated from samples created by the device in question. Each camera will present its own issues, and the results of your test of the camera in question can't be applied to other cameras of the same make / model. You'll need to know a bit about the signal path - from light coming in to the resulting image's storage - to be able to determine the correct value for the "predictors" variable. In my last experiment, that value was 17 ... 17 different parts / processes that could possibly be in error. In that case, the sample size calculation was as shown below.

t tests - Linear multiple regression: Fixed model, single regression coefficient

Analysis:  A priori: Compute required sample size
Input:     Tail(s)                     = Two
           Effect size f²              = 0.15
           α err prob                  = 0.01
           Power (1-β err prob)        = 0.99
           Number of predictors        = 17
Output:    Noncentrality parameter δ   = 4.9598387
           Critical t                  = 2.6099227
           Df                          = 146
           Total sample size           = 164
           Actual power                = 0.9900250

In this case, I needed to generate 164 valid files for my sample set - not 10. The plot is shown below.

With only 10 samples for such a test, you would have 9 chances in 10 of being wrong.
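Two of the output figures in the regression calculation follow directly from the inputs, which makes for a quick sanity check of any such report. Here is a minimal Python sketch, assuming the usual definitions for this test: noncentrality δ = √(f² × N) and residual degrees of freedom N − predictors − 1. The function name is hypothetical. Computing the power itself requires the noncentral t distribution, which is left out here.

```python
from math import sqrt

def regression_design(n, predictors, f2):
    """Return (delta, df) for a t test on one coefficient in a fixed-model regression.

    delta = sqrt(f2 * n) is the noncentrality parameter.
    df = n - predictors - 1 is the residual degrees of freedom.
    """
    df = n - predictors - 1
    delta = sqrt(f2 * n)
    return delta, df

if __name__ == "__main__":
    for n in (10, 164):
        delta, df = regression_design(n, predictors=17, f2=0.15)
        status = "estimable" if df > 0 else "not estimable (df <= 0)"
        print(f"n={n}: delta={delta:.7f}, df={df}, {status}")
```

With 17 predictors, any sample of 18 or fewer files leaves no residual degrees of freedom, so a model of this size can't be estimated from 10 files at all.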

Another scenario where the "official guidance" on sample sizes is quite off is with PRNU. The official guidance notes that the number of reference images used to build a reference pattern is limited to 50: "(the number of images is limited to 50, which proved to be enough – we’ll also discuss this in a future post)." But in the source documentation (link), the authors recommend more than that: "Obviously, the larger the number of images Np, the more we suppress random noise components and the impact of the scene. Based on our experiments, we recommend using Np > 50" (link). The authors of the source document actually used 320 samples per camera in proving out their theories - not 10 or 50. Other researchers examining PRNU have used even larger sample sets (link) (link). These three links weren't chosen at random to attempt to reinforce my point. These are the references found in the processing report generated by Amped's Authenticate - further illustrating the point about validation of tools and methodologies. (Many thanks to Dr. Fridrich for maintaining a massive list of source documentation.)
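The point about larger Np suppressing random noise can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: independent Gaussian noise on a single pixel and plain averaging, which is not the estimator used in the source documentation. The function name and the trial count are mine; the sample sizes of 10, 50 and 320 come from the discussion above.

```python
import random
from statistics import fmean, pstdev

def residual_noise(n_images, trials=200, sigma=1.0, seed=0):
    """Spread of the average of n_images noise samples, estimated over many trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = [
        fmean(rng.gauss(0.0, sigma) for _ in range(n_images))
        for _ in range(trials)
    ]
    return pstdev(means)

if __name__ == "__main__":
    for n in (10, 50, 320):
        # Theory for plain averaging: noise left over shrinks as 1 / sqrt(n).
        print(n, round(residual_noise(n), 3), "theory:", round(1 / n ** 0.5, 3))
```

Under these assumptions the leftover noise shrinks roughly as 1 / √Np, so going from 50 to 320 images still buys a meaningful reduction.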

But, as my old college football coach used to say, "the wrong way will work some of the time, the right way will work all of the time." Simply choosing a convenient number of samples may not be a problem in a justice system that has a 95% plea rate. But, as recent news points out, if you get caught out for bad methodology, all of your past work gets re-opened. Don't let this happen to you. Use sound methods. Get educated on the foundations of your work.

Remember, tools like Amped SRL's Authenticate don't render an opinion as to a file's authenticity - you do. Your opinion must be backed by a sound methodology, compliance with standards, and the fundamentals of science.

Correcting the lack of understanding of this vital topic was on the 2009 NAS Report's list of recommendations (link).

"The issues covered during the committee’s hearings and deliberations included:

(a) the fundamentals of the scientific method as applied to forensic practice—hypothesis generation and testing, falsifiability and replication, and peer review of scientific publications;
(b) the assessment of forensic methods and technologies—the collection and analysis of forensic data; accuracy and error rates of forensic analyses; sources of potential bias and human error in interpretation by forensic experts; and proficiency testing of forensic experts;
(c) infrastructure and needs for basic research and technology assessment in forensic science;
(d) current training and education in forensic science; ..." pg. 3

Ten years later, vendors are still ignoring the NAS's recommendations, often providing incorrect information to customers.

The scientific method forms the foundation of all of the forensic science training offerings that I've created over the years. Illustrating error rates, where they appear in the work, and how to calculate and control for them can be found in all of our forensic science courses.

If you'd like to move beyond "push-button forensics," I hope to see you in class soon.