
Thursday, March 28, 2019

Calculating Nominal Resolution during Content Triage

In the image / video analysis workflow, the Content Triage step examines the file's contextual information from the standpoint of "can I answer the question with this file?"

The Content Triage step necessarily involves Frame Analysis. Part of Frame Analysis considers the calculation of the Nominal Resolution of the target area - face, shirt, tattoo, license plate, etc. In this post, we'll consider license plates (aka number plates or registration plates) as our target.

Dimensionalinfo.com notes that in the US, standard license plate dimensions are 6 inches by 12 inches, or approximately 152 millimeters by 305 millimeters.

In the United Kingdom, for instance, the standard dimensions for registration plates are 520 millimeters by 111 millimeters for the front plates, while the rear plates can be of the same dimensions or 285 millimeters by 203 millimeters.

Australia, on the other hand, has standard dimensions of 372 millimeters by 135 millimeters, or approximately 14.5 inches by 5 inches.

The SWGDE Digital & Multimedia Evidence Glossary, Version: 3.0 (June 23, 2016), defines Nominal Resolution as "the numerical value of pixels per inch as opposed to the achievable resolution of the imaging device."  "In the case of digital cameras, this refers to the number of pixels of the camera sensor divided by the corresponding vertical and horizontal dimension of the area photographed."

Let's put this all together with an example from Australia. The question / request is: can we resolve the registration plate's characters on the white car, upper center left of the image shown below?

Ordinarily, you might be tempted to just say no, it's not possible. It's too small. Which of those few pixels do you want me to Photoshop into a registration plate? In this case, we'll attempt to quantify the value of the Nominal Resolution of the target area.

If the typical Australian registration plate is 372mm wide, and the target area is 8px wide, then the Nominal Resolution is ~46.5mm per pixel - meaning each pixel covers a width on target of ~46.5mm (about 1.8 inches). How many pixels wide are needed to resolve characters in a registration plate? My tests have shown that as few as 5-6 columns of pixels can work at distances to target of under 15' for typical CCTV systems. Given that a pixel is generally thought of as the smallest single component of a digital image, you'll need more than a few to resolve small details in an image or video in order to answer questions of identification.
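The arithmetic is simple enough to script. Here's a minimal sketch (the function name and structure are my own illustration; the plate width and pixel count are from the example above):

```python
def nominal_resolution_mm_per_px(target_width_mm: float, target_width_px: int) -> float:
    """Millimeters of real-world width covered by each pixel of the target area."""
    return target_width_mm / target_width_px

# Australian registration plate (~372 mm wide) imaged across 8 pixels:
mm_per_px = nominal_resolution_mm_per_px(372, 8)
print(round(mm_per_px, 1))         # 46.5 mm per pixel
print(round(mm_per_px / 25.4, 1))  # ~1.8 inches per pixel
```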

But, but, but ... the video system's specs note that the camera is HD and the recording is 1080p. The system's owner spent a lot of money on the tech. So what?!

Nominal Resolution deals with the target area, not the system's capabilities. The key is distance to target. The farther away from the camera, the lower the Nominal Resolution. The lower the Nominal Resolution, the lower the chance of successfully answering identification questions.

Thus, when responding to requests for service, it's a good idea to calculate the Nominal Resolution of the target area in order to quantify the pixel density of the region, adding the results to your conclusion. A statement such as, "unable to fulfill the request for information, re registration plate details, due to insufficient nominal resolution (~46.5mm per pixel)," is a lot more informative than "sorry, can't do it." If the available data supports no conclusion, adding the quantitative reasons for your conclusion will go a long way to supporting your determination.

If you'd like to know more about these types of topics, join me in an upcoming training session. Click here for more information.

Monday, March 25, 2019

Statistical Significance and Reporting Language

There's been a lot of talk lately about removing "statistical significance" from the reporting language of scientists, forensic scientists included. Many believe that the use of the term deliberately confuses the Trier of Fact and relies upon the fact that most are ignorant of statistics and their foundations.

The limitations of significance testing are among the topics covered in Statistics for Forensic Analysts, my stats class that has migrated from the classroom to our on-line micro-learning portal.

Statistics form the foundation of much of what we do in:

  • Forensic Video Analysis
  • Forensic Audio Analysis
  • Digital Forensics
  • Latent Print Examinations
  • Questioned Document Examinations
  • Toolmark / Firearms / Treadwear Examinations
  • Shooting Incident Reconstructions
  • Traffic Investigations / Recreations
With so many folks needing to gain knowledge / experience in statistics, offering the course on-line and allowing sign-up at any time allows us to get this valuable information out to as many people as possible in the shortest amount of time. No need to wait for the next class, no conflicting schedules. Just sign-up and begin learning.

Click here for more information, or to sign-up today.

Saturday, March 23, 2019

Quick File Triage

A random request via email prompts Round 2 of the product comparison - Input-Ace vs Amped FIVE. In this case, I received a .re4 file via Dropbox with the question - "what is license plate?" Well, not those words, but I distilled a few paragraphs down to the testable question.

Working the workflow, we conduct File Triage - can I even view this file? The person who emailed noted his failure at finding a way to play the file (he's likely the last person in the world who hasn't heard of Larry Compton's web site, where you'll find three flavors of .re4 players). It's common to not want to install players / codecs, as I've said for years. I know that the .re4 format is just H.264 video in a crazy proprietary container. Thus, for File Triage, I know that I can load it into Input-Ace or FIVE with no issue.

For File Triage of this random .re4 file - it's a tie(ish). Input-Ace takes about 30 seconds to decode and load the file. FIVE needs to "convert" the file, and takes about 45 seconds to create the "clean" file, the proxy, and the report.

The next step in the workflow is Content Triage, or "can I answer the question / satisfy the request with the given file?" I noted that the file's resolution was 2CIF. I also noted that the vehicle in question was about 50' from what was likely a 2mm-4mm lens. The pixel dimensions of the area likely containing a license plate were about 12x7px.

With every other line of resolution not recorded in the first place, and the target too far away from the lens, identification questions will not be answered with this file. The available video data will not support a conclusion. But, let's pretend that there is enough data. As this is a product comparison, let's run a Macroblock Analysis.

The file was about 500MB, containing about an hour of 2CIF footage. Again with the aid of a trusted analyst, fluent with Input-Ace, the set-up of the analysis as well as the generation of results took less than 30 seconds. The same file took FIVE 4.5 minutes to complete the task. We checked a select number of frames in each tool and found the results to be the same.

For Macroblock Analysis on this random .re4 file - score one for Input-Ace.

There were just a few losslessly encoded blocks in the target area. With every other line of resolution missing and no usable data in the target area, there's no conclusion possible. All you're left with is "white car." Sure, you can attempt a Vehicle Determination Examination. You may get make / model / and a range of model years. That might help narrow the search a bit.

Given that a majority of cases die at the Content Triage step, getting quickly and easily to and through that step is vital. With Input-Ace and this file, we were done with the case in less than 2 minutes. With FIVE and this file, the unexplained length of processing time for the Macroblock Analysis made it take more than double the amount of time to get to "no, sorry."

Of course, the rebuttal will be that FIVE contains so many more tools vs. Input-Ace - which is true.  But, who cares about fixing fisheye or rectifying the image if the question, "what is license plate," is unanswerable? 90%-95% of the time, folks just need a quick answer ... that's actually quick.

Now, I know what you're thinking. Amped's DVRConv is their product for the quick creation of proxy files. Again, you're correct(ish). DVRConv does not support all of the formats that FIVE does. In this case, loading the .re4 file into DVRConv caused the program to crash. If all you had was DVRConv, you'd be out of luck with this file.

As a final aside, I do like that Input-Ace reports the non-standard frames/second tag as "Unknown" (see the above graphic) as opposed to the FFMPEG default of 25fps (as FIVE does). Sure, I can restore the frame rate in both programs, but the untrained person may assume that 25fps is correct and not take steps to restore the proper rate. Remember, the triage steps aren't always performed by trained analysts. They're often performed ahead of the decision to engage with the analyst. [shameless plug - seats are open for our upcoming training sessions - click here for more info or to sign up]

That's going to about do it for the Input-Ace vs FIVE comparisons. By now, I think you get the points. Besides, there's a cool new plug-in from Chris Russ that deserves a look.

Wednesday, March 20, 2019

Redaction of Multimedia Evidence

Recent legal changes in the United States have impacted the way in which agencies handle video / multimedia evidence requests from the public. Complying with the new laws, and maintaining compliance with state and federal privacy codes requires agencies to manage the processing of requests quite carefully. Requests for records around a critical event can quickly swamp an agency’s staff given the new statutory time frames for the release of data.

For example, California Senate Bill 1421, Skinner (D. Berkeley), opens public access to internal investigations of police shootings and other incidents where an officer killed or seriously injured someone, as well to sustained findings of sexual assault and lying on the job.

Another example is California Assembly Bill 748, Ting (D. San Francisco), which requires police departments to release within 45 days audio or video footage of shootings or other incidents involving serious use of force, unless it would interfere with an active investigation.

With this in mind, agencies often have to get up to speed on the tech (and how to use it) fast. But, they don't always have a budget to send people to training - or to bring an instructor to their agency.

Redaction actually lends itself quite nicely to the on-line micro-learning model. That's why I've designed a curriculum for redaction using the most popular software platforms from Adobe, Amped, Audacity, and Vegas Pro. Because it's micro-learning, folks can take up to 60 days to complete the course - meaning they can fit it into their busy schedules. Also, the courses are reasonably priced.

  • AL110 - Redaction for Standards Compliance - featuring the Adobe Creative Suite tools. 
  • AL111 - Redaction for Standards Compliance - featuring Magix Vegas Pro 16 + Audacity. 
  • AL112 - Redaction for Standards Compliance - featuring Amped FIVE + Audacity.
  • AL113 - Redaction for Standards Compliance - featuring Adobe, Amped, Audacity, and Magix tools.

AL110, AL111, and AL112 (registration fee = $249 per student) are for those agencies that have already decided on a platform and want to save money - training on a single software solution. AL113 (registration fee = $349 per student) is for those that haven't yet decided on a platform and want information / instruction on the most popular software solutions.

These multimedia redaction courses will get the agency and their redaction staff up to speed on the issues and the technology necessary to bring the agency into compliance with the new laws.

Head on over to the web site for more information and sign up today.

Tuesday, March 19, 2019

New courses coming this week

I'm putting the finishing touches on the redaction courses that I've been working on. I've presented redaction (audio / video) as in-person training for years. This week, I'll release the micro-learning versions of the courses.
  • AL110 - Redaction for California SB1421 / AB748 Compliance - featuring the Adobe Creative Suite tools.
  • AL111 - Redaction for California SB1421 / AB748 Compliance - featuring the Magix Software tools (Vegas Pro 16 + Sound Forge Audio Cleaning Lab).
  • AL112 - Redaction for California SB1421 / AB748 Compliance - featuring Amped FIVE and Audacity.
  • AL113 - Introduction to Redaction of Multimedia Evidence - features processing in Adobe, Amped, Audacity, and Magix software products.
Whilst AL110, AL111, and AL112 were designed with the new redaction laws in California in mind, they're a lower cost option for those agencies who want to focus on a single software package. The California laws require a fast turnaround of redaction requests whilst forbidding the agency from charging the requestor or seeking reimbursement from the state. With this in mind, the workflow is geared towards efficiency and speed. As such, agencies / employees around the world can benefit from this course.

AL113 is perfect for those agencies / employees exploring the concept for the first time as well as those looking at the various software packages available for off-line redaction. 
  • Adobe and Magix products are ubiquitous in the marketplace, enjoying a massive customer base. For off-line redaction, they are cost effective and relatively simple to use.
  • Amped FIVE was chosen as many agencies have purchased it for redaction and analysis. It's probably the fastest manual redaction tool on the market, even though it's not a redaction tool as such (it's an analysis platform that contains redaction functionality). FIVE doesn't have the ability to redact audio. Thus, I've paired it with Audacity, an amazingly easy to use freeware audio editor.
For all of our training offerings, visit our training page by clicking here. At your location, at our location, or on-line - think of us for all your training needs.

Friday, March 15, 2019

The Tilde - a most profound character

Every single character in a technical specification means something. Sometimes, the character is rather significant. The Tilde (~) is one of those rather significant characters often found in DVR / NVR spec sheets.

Take this specification from a typical Dahua DVR.

The video frame rate is an approximate value (e.g. ~30f/s for NTSC). It's not an exact value. It's a $7 box of random parts, not a Swiss chronometer. Thus, the folks at Dahua are hedging by saying - we're going to try to get each channel to the specified frame rate, but don't hold us to it.

We know that the Tilde does not mean "between" in this document. The clue is in the bit rate. The bit rate is given as a range. For the 4-Ch version, the rate is 2048Kbps-4096Kbps. Read aloud, it's "between 2048Kbps and 4096Kbps."

Thus, in building a performance model of a specific DVR for case work, one will need to find the range of values in which the approximate FPS for the specific DVR falls.

Basic condition of the DVR

For the 4-Ch version, what happens inside the evidence DVR (FPS per camera) when:

  • A single channel is recording normally but none of the others is recording (no motion detected / no alarm). (test for each channel)
  • Two channels are recording normally and two are not. (test for each combination of two channels)
  • Three channels are recording normally and one is not. (test for each combination of three channels)
  • All four channels are recording normally.
  • All four channels are set to record on motion only. Repeat the above but instead of pressing the record button for normal recording, cause a motion trigger recording.
  • If alarm functionality is present: all four channels are set to record on alarm only. Repeat the above but instead of pressing the record button for normal recording, cause an alarm trigger recording.
How reliable is the Basic Condition? Better put, how variable are the results? Given the signal path inside the average Chinese DVR, like Dahua, will the four channels feed into a single encoder? How does the signal path prioritize data when all channels are feeding it data? The same question applies when the data heads across the bus to the HDD. What does its error state look like? To find out, you need a proper / valid sample set of tests. How big will that sample set be? How does 164 complete run-throughs of the above scenario work for you?

Usually, when a number this large is quoted, folks will complain about the cost. But, there's another cost to factor - the cost of being wrong. You see from the graph below that under about 50 test sets, your chances of being wrong are greater than your chances of being right. Put another way, under 50 test sets, you're better off flipping a coin. This is something, the calculation of sample sizes, that I cover in my class - Statistics for Forensic Analysts.

Back to the evidence file. I've seen some DVRs insist upon producing a container that had an even distribution of 30 FPS every time. But, an examination of the frames within that container proved that many of the frames within a GOP were exact duplicates of other frames. Meaning? The machine likely got overwhelmed and padded the container such that it could maintain the specified rate. For this determination, frames were extracted from the container and Similarity Metrics were calculated. Correlation coefficients were 1 - a match, no difference.
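As a sketch of how such a duplicate-frame check might work outside of any particular tool (assuming frames have already been extracted as grayscale arrays; the function is my own illustration, not any vendor's implementation):

```python
import numpy as np

def frame_correlation(a: np.ndarray, b: np.ndarray) -> float:
    """Pearson correlation coefficient between two same-sized grayscale frames."""
    a = a.astype(float).ravel()
    b = b.astype(float).ravel()
    return float(np.corrcoef(a, b)[0, 1])

# Two identical "frames" correlate at exactly 1 - a padded duplicate.
f1 = np.arange(64, dtype=np.uint8).reshape(8, 8)
f2 = f1.copy()
print(frame_correlation(f1, f2))  # 1.0
```

A correlation coefficient of exactly 1 between adjacent frames in a GOP is the quantitative flag that the frame was duplicated rather than freshly encoded.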

All of this information is a bit expensive in time and technology to discover. But, it's vital to know this information in certain types of cases involving video - reconstruction of police uses of force, traffic investigations, etc.

If you'd like to know more, shoot me an email.

Wednesday, March 13, 2019

A few quick comments about hash values and DVR files

By now, it's well known that the world of DVR forensics and computer forensics kind of looks the same, but really isn't quite the same. Answers to questions regarding DVR forensics often start with "it depends."

In the world of digital forensics, every item of evidence should / must be hashed upon receipt. And ... the same is true for DVR files in the world of forensic video analysis. You should hash all files upon receipt.

You receive a couple of .264 files from a crime scene, you hash them, then you make copies, and begin your work. If you're working a computer forensic style of workflow, then your copies are "true and exact copies of the originals." But, you're working a forensic video analysis type workflow. Your copies are not true and exact copies of the originals - they're proxy files. Proxy means substitute / stand-in. They're "converted." The process of creating the proxy - re-wrapping the data stream, carving out the time stamp ... all of those convenient things we do to make our lives easier ... means that the resulting proxy file will not have the same hash value as the original from the crime scene. And ... that's OK. That's normal.

Using a tool like FTK imager, one can hash a whole project folder. In the folder are the original files, the proxy files, any other derivative files, etc. A quick examination of the hash values clearly shows that the hash of the original .264 files (green) does not match the hash of the .avi proxy files (red). Again, that's OK. That's normal.

As forensic video analysts, we need to know that (a) this is happening and (b) it's normal. If you have to create a proxy file in order to work and respond to the request, this mismatch of the hash values will happen. Just hash the resulting files in the working folder to keep a record of them as they were created. Pull a still frame out as a separate file? Hash it. Carve out the time stamp as a separate file? Hash it.
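For those without FTK Imager handy, hashing an entire working folder can be scripted with the standard library. A minimal sketch (the folder path in the usage comment is hypothetical):

```python
import hashlib
from pathlib import Path

def hash_folder(folder: str, algorithm: str = "sha256") -> dict:
    """Return {relative path: hex digest} for every file under the folder."""
    digests = {}
    root = Path(folder)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h = hashlib.new(algorithm)
            with open(path, "rb") as f:
                # Read in chunks so large video files don't exhaust memory.
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            digests[str(path.relative_to(root))] = h.hexdigest()
    return digests

# Example: record the state of a case working folder as it was created.
# for name, digest in hash_folder(r"C:\Cases\example\working").items():
#     print(digest, name)
```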

As worlds converge, there will be a bit of sorting out of procedures and dialog. It's OK.

Saturday, March 9, 2019

Simple image and video processing

I'm back doing product reviews and comparisons. Yay!

In the blog post announcing my joining the Amped team a few years ago, its CEO noted "Jim is well known also for his direct, unfiltered and passionate style."

It's been a while, but the directness and passion are still here; and I have yet to find a filter. ;) ... and, in case you missed it, I'm no longer part of the Amped team. More on that in a future post.

Today's test - Input-Ace vs Amped FIVE for simple image / video processing. *

To facilitate the test, I've enlisted the help of an analyst from a private lab with Input-Ace and FIVE. I just needed a few stills and screen shots to work with. The test begins with a video extracted by me from a black box 2CIF DVR, the kind that are rather ubiquitous here in the US. It's one of my test/validate files, so I know the values that we are starting with.

The task: "I just need a still image of the vehicle for a BOLO." This should be a 2 minute process .. or less.

Now, the process and tools are a bit different when working in Input-Ace vs Amped FIVE. But, I devised a test of something I used to do several times per day at the LAPD - processing 2 CIF video for a flyer. The resize / aspect ratio problems of 2 CIF resolution can be fixed in either tool.

My experiment is two-fold.
  1. How fast / easy is it to load a video, fix the resolution issue (restore the missing information that happens when the DVR is set to ignore every other line of resolution - 2 CIF), and output a still for a BOLO.
  2. What's the difference in quality between the two processes? Is there a difference in the results? If so, what is it?

The "workflow engine" way of working is not natural to me. But, my friend is rather proficient with the tool and noted that fixing the resolution issue was a two-step process - first restore the aspect ratio and then restore the size. Each step in Input-Ace utilized Nearest Neighbor interpolation. The time to configure the filters and output a still was less than 30 seconds.

Nearest neighbor simply copies the value of the closest pixel in the position to be interpolated. (Anil. K. Jain, “Fundamentals of Digital Image Processing”, Prentice Hall, pp. 253–255, 320–322, 1989. ISBN: 0-13-336165-9.)

For FIVE, this solution is equally easy - the Line Doubling filter (Deinterlace > Line Doubling). Line Doubling utilized a Linear interpolation. As with Input-Ace, the time to configure the filters and output a still was less than 30 seconds.

Linear interpolates two adjacent lines. (E. B. Bellers and G. de Haan, “Deinterlacing - An overview”, in Proceedings of the IEEE, Vol. 86, No. 9, pp. 1839–1857, Sep. 1998. http://dx.doi.org/10.1109/5.705528.)

In terms of speed and ease of use - it's a tie. 

Remember, this task isn't an "analysis" as such. This process is one of those common requests - we just need an image, quickly, for a BOLO. But, you want to fix the problems with the file before giving the investigator their image. Sending out a 2 CIF image without correcting / restoring the resolution could lead to problems with recognition as the images appear "squashed."

Next, I wanted to know if there was a qualitative difference between the two resulting files. This is where FIVE excels - analysis. FIVE's implementation of Similarity Metrics (Link > Video Mixer) was used.

The results:
  • SAD (Sum of Absolute Differences - mean) (0 .. 255): 2.0677.
  • PSNR (Peak Signal to Noise Ratio - dB): 28.7395.
  • MSSIM (Mean Structural Similarity) (0 .. 1): 0.9335.
  • Correlation (Correlation Coefficient.) (-1 .. 1): 0.9895.
A rather definitive result. As regards the correlation coefficient, a value of exactly 1.0 means there is a perfect positive relationship between the two variables. A value of -1.0 means there is a perfect negative relationship between the two variables. If the correlation is 0, there is no relationship between the two variables. A correlation coefficient with an absolute value of 0.9 or greater represents a very strong relationship. In this case, the value is 0.9895 ... or very nearly 1. The other results can confirm quantitatively what your eyes can see qualitatively - to the eye, the results are virtually identical. Same truck. No visual loss of details.
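The first two of those metrics have standard, tool-independent definitions, so they can be sketched directly (my own illustration in NumPy - not FIVE's implementation; the sample frames are made up):

```python
import numpy as np

def sad_mean(a: np.ndarray, b: np.ndarray) -> float:
    """Mean of absolute per-pixel differences (0..255 for 8-bit images)."""
    return float(np.mean(np.abs(a.astype(float) - b.astype(float))))

def psnr(a: np.ndarray, b: np.ndarray, peak: float = 255.0) -> float:
    """Peak Signal to Noise Ratio in dB; higher means more similar."""
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return float(10 * np.log10(peak ** 2 / mse))

# Two hypothetical frames differing by a constant offset of 2 levels:
a = np.full((8, 8), 100, dtype=np.uint8)
b = np.full((8, 8), 102, dtype=np.uint8)
print(sad_mean(a, b))  # 2.0
```

MSSIM is considerably more involved (local windows, luminance / contrast / structure terms); libraries such as scikit-image provide a structural similarity implementation if you want to reproduce that figure as well.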

From a qualitative standpoint - it's a tie.

Thus, if the two pieces of software can deliver what is visually the same result, in the same amount of time, then what's the tie breaker? Features? Service? Price?

The tie breaker is for you to decide. What features are "must haves?" What are the terms of service? Are they acceptable to your agency? What's the better value for your agency, your caseload, and your workflow?

For features, does it matter if there are 130 filters / tools if you only use about a dozen of them on a regular basis? I'm in a different place in my casework now - more "analysis" than "processing." For me, the value proposition based on features still tilts towards Amped's tools. Besides, I've had my license for years now. I'm not coming at the tool for the first time. For many at the local police level, it's more processing than analysis - with analysis being done at the prosecutor's office. Input-Ace as a production tool is quite well positioned. As an analysis tool, you'll need something else for now. The testimony of Input-Ace's primary evangelist, Grant Fredericks, will confirm that it's a major part of his tool-set, but not the only tool he uses.

For the terms of service, examine each product's End User License Agreement - otherwise known as "that thing you don't read as you install the software." The EULA is the company's promise to you - the terms of what you're getting. Are the terms acceptable? If you're in North America, can you get someone on the phone to help during business hours? Input-Ace vs Amped contact pages are linked here for your convenience. Are you OK with a web portal as your only option?

For price, it can only be published price vs. published price. I am told that the quoted price for an annual subscription of Input-Ace is US$3k. The generally quoted price of a perpetual license of FIVE is EUR 9000. The EUR / USD exchange has been quite volatile of late, so the dollar price has come down a bit under US$11k on the exchange. I believe that a perpetual license of Input-Ace is available, but I don't have information on that price ... nor do I know an agency that has gone that route.

Then there's the ordering component of price. Price doesn't matter if you can't order the product. Can your agency pre-pay for goods via an office shared in Brooklyn, NY, with 40+ other entities at any price? States like New Jersey and Illinois forbid such pre-payment explicitly. Counties and cities like Los Angeles forbid pre-payment practically. Did you check up on the business? Run a Bizapedia search on the business. Do you get a result? Do you recognize the names? Does anything look odd? FIVE's EULA indicates that support will be done via the web portal, not at their "NY office." Are those terms acceptable? Then check the provider of the competition. Do the same thing. Run a Bizapedia search. Recognize any names? It may not be your money that your agency is spending, but due diligence is necessary nonetheless.

I'm of the firm belief that any true comparison of products must include the total experience - comparing the features as well as the user / customer experience. Features are pretty straightforward. User / customer experiences vary from person to person.  I've shared mine with you. Feel free to share yours with me.

*An important note: this was a simple case study. Its results were valid for what specifically was studied. It's not meant to validate either tool for use on a particular case, or your particular case. The opinions expressed herein are those of the author alone.

Friday, March 8, 2019

The return of product reviews and comparisons

Some of the things that I haven't been able to do for some time are product reviews and comparisons. First, the blog was all but shut down by my LAPD CO. Then, in my LE retirement, I was able to do product features when I was employed at Amped Software, Inc., to be sure. But, I was never allowed to compare / contrast Amped SRL's tools with others available on the market. Now that I'm on my own, I can pick up where I left off. There's a lot of catching up to do. Well, it's time ...

First out of the gate: Input-ACE vs Amped FIVE.

I will examine some common workflows and file processing challenges. I won't hold anything back.

Stay tuned.

Thursday, March 7, 2019

Calculating the cost of redacting police body worn video

There's been an interesting back and forth over the release of body camera video and the cost of redaction happening in Washington DC. As someone who teaches and performs redactions, I've been following it with some interest.

Here's the source document that I'll be referencing from DC Transparency Watch.

I first got wind of this from a tweet. Here's a copy:

It seems that an elected neighborhood representative wanted DC Metro to release "the video" of the incident. What's not in dispute is that 7 officers arrived to find 3 juveniles, resulting in 229 minutes of footage. The elected official's tweet certainly indicates her displeasure at being invoiced for services needed to fulfill her request.

Whilst not every state allows agencies to charge for the production of redacted copies of video / audio, it seems that the Federal City does.

To some, $5,387.00 might seem a bit much. But, is it a fair charge? How did the MPD come up with this rather precise number? MPD's initial response to the request was an invoice indicating a charge for 229 minutes at $23 per minute. Again, where does this number come from? Is it fair?

There's some interesting back and forth on social media about this, like the email reply shown above. But, I was more concerned with the cost calculation.

As I teach redaction, I use a job costing formula that looks like this:

( ___ [total footage time per recording] x ____ [processing time per minute of recording]) x ___ [total subjects or objects to redact] = ___ minutes x ___ recorders = ___ minutes to complete the task. Total time x ___ $/hr (employee cost) = ____ total cost.

The total footage time, in this case, gets divided by the total number of officers on scene with BWCs. The processing time per minute of recording varies based upon what task is being performed. In the standards compliant workflow that I teach, these tasks are Triage, Redaction Task, and Redaction Quality Assurance. Each of these steps use the same formula, but the processing time per minute varies - Triage is the longest, then QA, then the actual redaction task. There's case management and supervisory review to factor in as well.
Note: (I do give the complete formula and all of the variables in class)
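As an illustration of the formula's structure only - the actual per-step processing multipliers are given in class, so every input value below is hypothetical:

```python
def redaction_cost(total_minutes: float, recorders: int,
                   proc_min_per_min: float, subjects: int,
                   hourly_rate: float) -> float:
    """Job cost for one workflow step, following the fill-in formula above.

    proc_min_per_min (processing minutes per minute of footage) varies by
    step (Triage, Redaction, QA); the value used here is made up.
    """
    footage_per_recorder = total_minutes / recorders
    minutes = (footage_per_recorder * proc_min_per_min) * subjects * recorders
    return (minutes / 60.0) * hourly_rate

# 229 minutes of footage across 7 body cameras, 3 subjects to redact,
# a hypothetical 2 minutes of processing per minute of footage, at $22/hr:
print(round(redaction_cost(229, 7, 2.0, 3, 22), 2))
```

Run the function once per workflow step with that step's multiplier, then sum the steps (plus case management and supervisory review) to reach the total estimate.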

Based on the above formula, I calculated that it would take about 311 hours to complete the request. A police records clerk in DC earns between $38k and $45k per year (source). Many states also factor in the total employment cost of the employee, which includes pension and benefits. Also considering that the supervisor most likely earns more, I calculated the average hourly rate to be $26/hr. With this in mind, the cost to fulfill this request would be $8,089.38.  Not far off from what the representative was invoiced.

Assuming that MPD is overworked and that likely some of the workflow was skipped, and assuming that the agency just billed the average employee cost at salary without benefits and pension, the numbers were adjusted. I eliminated the QA step and adjusted the hourly rate to $22/hr. The new totals look like this: 231.01 hours x $22/hr = $5,082.22 - right about where the MPD ended up with their invoice.

I don't intend to weigh in on the politics of one branch of the same government charging another branch. I mean only to illustrate the rational basis of setting costs for redaction of body camera video - or any video that is in police custody and subject to privacy restrictions before public release.

Cost estimation is a necessary step in the process.
  • Supervisors need to know how long it will take to perform each step in the process in order to know to whom to assign the work.
  • Staff need to know how long the work will take so that the requestor can be given an estimated completion date (most statutes require notification if the turnaround will be longer than 7 days).
  • Agencies need to know how long these requests take to fulfill such that appropriate staffing levels are maintained.
  • When agencies evaluate technology for redaction, they need a baseline calculation in order to gauge the proposed cost/time numbers.
This exercise is just a peek at what is covered in our Redaction courses. We offer a course specific to California's new laws, a course that covers the issue generically (both legal and technical) and is thus applicable anywhere, as well as courses that cover the general legal environment combined with the specific technology used for the redaction tasks - Adobe, Amped (with Audacity), or Magix. We offer them either on-site (ours or yours) or on-line. Click here for more information. We'd love to see you in one soon.

Wednesday, March 6, 2019

Evaluating Research

As analysts, we rely upon research to form the basis of our work. If we're conducting a 3D measurement exercise utilizing Single View Metrology, as operationalized by the programmers at Amped SRL, we're relying not only upon the research of Antonio Criminisi and his team, but also the programmers who put his team's work into their tool. We trust that all those involved in the process are acting in good faith. More often than not, analysts don't dig around to check the research that forms the foundation of their work.

In my academic life, I've conducted original research, I've supervised the research of others, I teach research methods, I've acted as an anonymous peer-reviewer, and I participate in and supervise an Institutional Review Board (IRB). In my academic life, as well as in my professional life as an analyst, I use the model shown above to guide my research work.

For those who are members of the IAI and receive the Journal of Forensic Identification, you may have noticed that the latest edition features two letters to the editor that I submitted last summer. For those who don't receive the JFI, you can follow along here, but there's no link that I can provide to share the letters, as the JFI does not allow the general public to view its publications. Thus, I'm sharing a bit of my thought process in evaluating the particular article that prompted my letters here on the blog.

The article that prompted my letters dealt with measuring 3D subjects depicted in a 2D medium, otherwise known as photogrammetry (Meline, K. A., & Bruehs, W. E. (2018). A comparison of reverse projection and laser scanning photogrammetry. Journal of Forensic Identification, 68(2), 281-292).

When evaluating research using the model shown above, one wants to think critically. As a professional, I have to acknowledge that I know both of the authors and have for some time. But I have to set that aside, mitigating any "team player bias," and evaluate their work on its own merits. Answers to questions about validity and value should be informed not by my relationships but by the quality of the research alone.

The first stop in my review is a foundational question: is the work experimental or non-experimental? There is a huge difference between the two. In experimental research, an independent variable is manipulated in some way to find out how it will affect the dependent variable. For example, what happens to the recorded frame rate in a particular DVR when all camera inputs are under motion or alarm mode? "Let's put them all under load and see" tests the independent variables' (camera inputs) potential influence on the dependent variable (the recorded file) to find out if / how one affects the other. In non-experimental research there is no such manipulation; it's largely observational. Experimental research can help to determine causes, as one is controlling the many factors involved. In general, non-experimental research cannot help to determine causes. Experimental research is more time consuming to conduct, and thus more costly.

With this in mind, I read the paper looking for a description of the variables involved in the study - what was being studied and how potential influencing factors were being controlled. Finding none, I determined that the study was non-experimental - the researchers simply observed and reported.

The case study featured a single set of equipment and participants. The study did not examine the outputs from a range of randomly chosen DVRs paired with randomly chosen cameras, nor did it include a control group of participants. For the comparison of the methods studied, the study did not feature a range of laser scanners or a range of tools with which to create the reverse projection demonstratives. No discussion of the specifics of the tool choices was given.

For the participants - those performing the measurement exam - no random assignment was used. Particularly troubling, the participants used in the study were co-workers of the researchers. Employees represent a vulnerable study population, and problems can arise when these human subjects are not able to properly consent to participating in the research. As an IRB supervisor, I see this situation as raising the issue of potential bias; I would expect to see a statement about how the bias would be mitigated in the research and that the researchers' IRB had acknowledged and approved of the research design. Unfortunately, no such statement exists in the study. Given that the study was essentially a non-experimental test of human subjects, and not an experimental test of a particular technology or technique, an IRB's approval is a must. One of the two letters that I submitted upbraided the editorial staff of the JFI for not enforcing its own rule requiring an IRB approval statement for tests of human subjects.

Given the lack of control for bias and extraneous variables, the lack of randomness of participant selection, and the basic non-experimental approach of the work, I decided that this paper could not inform my work or my choice in employing a specific tool or technique.

Digging a bit deeper, I looked at the authors' support for the statements made - their references. I noticed that they chose to utilize some relatively obscure or largely unavailable sources. The chosen references would be unavailable to the average analyst without paying hundreds of dollars to check even one of them. In my position, however, I have access to a world of research for free through my affiliations with various universities. So, I checked their references.

What I found, and thus reported to the Editor, was that many times the cited materials could not be applied to support the statements made in the research. In a few instances, the cited material actually refuted the authors' assertions.

In the majority of the cited materials, the cited authors noted that their work couldn't be used to inform a wider variety of research, that case-specific validity studies should be conducted by those employing the technique described, or that they were simply offering a "proof-of-concept" to the reader.

In evaluating this particular piece of research, I'm not looking to say - "don't do this technique" or "this technique can't be valid." I want to know if I can use the study to inform my own work in performing photogrammetry. Unfortunately, due to the problems noted in my letters, I can't.

If you're interested in engaging in the creation of mixed-methods demonstratives for display in court, Occam's Input-ACE has an overlay feature that allows one to mix the output from a laser scan into a project that contains CCTV footage. The overlay feature is comparable to a "reverse projection" - it's a demonstrative used to illustrate and reinforce testimony. A "reverse projection" demonstrative is not, in and of itself, a measurement exercise / photogrammetry, though it is possible to set up the demonstrative and then use SVM to measure within the space. If one wants to measure within the space as a general rule (not case specific), proper validity studies need to be conducted. At the time of this writing, no such validity studies exist for the calculation of measurements with such a mixed-methods approach. If one is so inclined, the model above can be used to both design and evaluate such a study. Until then, I'll continue on with Single View Metrology as the only properly validated 3D measurement technique for 3D subjects / objects depicted within a 2D medium.

Monday, March 4, 2019

What is Analysis?

What is analysis?

a·nal·y·sis [əˈnaləsəs] - NOUN
     analyses (plural noun)

  • detailed examination of the elements or structure of something. "statistical analysis" · "an analysis of popular culture"
synonyms: examination · investigation · inspection · survey · scanning · study · scrutiny · perusal · exploration · probe · research · inquiry · anatomy · audit · review · evaluation · interpretation · anatomization

  • the process of separating something into its constituent elements. Often contrasted with synthesis. "the procedure is often more accurately described as one of synthesis rather than analysis"

synonyms: dissection · assay · testing · breaking down · separation · reduction · decomposition · fractionation
antonyms: synthesis

Forensic science is the systematic and coherent study of traces to address questions of authentication, identification, classification, reconstruction, and evaluation for a legal context.  (Source: A Framework to Harmonize Forensic Science Practices and Digital/Multimedia Evidence. OSAC Task Group on Digital/Multimedia Science. 2017)

What is a trace? A trace is any modification, subsequently observable, resulting from an event. You walk within the view of a CCTV system, you leave a trace of your presence within that system.

Thus, forensic video analysis (or forensic multimedia analysis) can be seen as a systematic and coherent examination of video (multimedia) traces (elements) to address questions of authentication, identification, classification, reconstruction, and evaluation for a legal context.

In the former definition, we can see the quantitative nature of analysis. The latter definition reveals its potential qualitative elements.

In a quantitative data analysis, things are stable, controlled - facts can be obtained (facts are measurable / objective). In a qualitative data analysis, things are dynamic. Your role as an observer may influence the analysis. What is "true" depends on the situation & setting (truths are things we "know" - subjective). A quantitative study is controlled. A qualitative study is observed.

A qualitative study's purpose is to describe or understand something. The purpose of a quantitative study is to test, resolve, or predict something (e.g., in order to use a DVR to determine the speed of an object within its derivative video files - results will be a range of values - one must resolve how the DVR "typically" creates files through a controlled series of tests).

The analyst's viewpoint during a quantitative study is logical, empirical, deductive (conclusion guaranteed). In a qualitative study, it's situational and inductive (conclusion merely likely) or abductive (taking one's best shot). Performing a comparative analysis with convenience samples is an example of taking one's best shot. A quantitative study would feature an appropriate sample size calculation and note any limitations that arose as a result of not being able to achieve the appropriate samples.

From a contextual standpoint, a quantitative study's context is not taken into consideration but rather controlled via methodological procedures. In this way, potential bias is mitigated. In a qualitative study, context matters - values, feelings, opinions, and individual participants matter.

In a quantitative study, the analyst seeks to solve, to conclude, or to verify a predetermined hypothesis. With a qualitative study, the orientation changes - seeking rather to discover or explore. This can occur often in investigations - new information developed leads to changes in the direction of the investigation as things / people are ruled-in / ruled-out.

In a quantitative analysis, the inputs and results are numerical - data is in the form of numbers / numerical info. A qualitative analysis is narrative in nature - data is in the form of words, sentences, paragraphs, notes, or pictures / graphics / etc.

After conducting a quantitative analysis, one's results / findings can be generalized to other populations or situations. The results of a qualitative analysis are case specific, particular, or specialized.

With all of this in mind, what is analysis? What type of analysis are you conducting? What type of analysis are you reporting? When analyzing the work of other analysts, what type of work are they conducting / reporting?

You can use this dialog to build a template / matrix. In reviewing work, examine the elements above to determine if the work is quantitative or qualitative. For example, you're reviewing an analyst's work on a measurement request (photogrammetry). The results section features a picture that has been marked up with arrows and text. No methodology is discussed. These results would be considered qualitative. If the results section featured a conclusion, a range of values, error estimation, and a reference / methodology section, it could be considered quantitative. You could take the analyst's data and reproduce their study - which is not possible from an annotated picture.
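One way to operationalize such a template / matrix is a simple checklist. The sketch below is mine, not from any standard; the element names are drawn from the photogrammetry example above, and you would substitute your own criteria.

```python
# Elements that mark a results section as quantitative (per the example above).
# These names are illustrative placeholders, not drawn from a standard.
QUANTITATIVE_ELEMENTS = {
    "conclusion",
    "range of values",
    "error estimation",
    "reference / methodology section",
}

def classify_results(elements_present):
    """Label a results section 'quantitative' only if every checklist element
    is present; otherwise label it 'qualitative' and list what's missing."""
    missing = sorted(QUANTITATIVE_ELEMENTS - set(elements_present))
    label = "quantitative" if not missing else "qualitative"
    return label, missing

# An annotated picture with no methodology discussed:
label, missing = classify_results({"conclusion"})
print(label, missing)
```

A reviewer could extend the checklist with one set per report section and keep the results as a matrix across cases.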

The elements for a quantitative analysis described above, when reported back to the Trier of Fact, help ensure that you've maintained standards compliance (ASTM E2825-18). Rhetorical or narrative statements are fine for the introductory section of your report - a summary of the request - but are not sufficient for supporting a conclusion or describing one's processes.

If you'd like to know more, join me in an upcoming training session. For more information or to sign up, click here.