Took the Plunge - GoldStar OS-7020 Oscilloscope
james_s:
Ah, so it sounds like you're dealing with someone else's failed repair. You're going to have to look very carefully, because there's no telling what else they messed with in there. I would change out that 1M resistor for the proper 3M, or three 1M resistors in series; the lower value could be increasing the load significantly on the HV supply, which is not able to supply much current. Also clean off the flux: some types of flux are conductive, and that can also cause problems in high voltage and high impedance circuits.
The neon lamps are likely functioning as spark gaps; their purpose is to protect the electronics in case an arc occurs in the CRT. It's fairly common to see that arrangement on older CRT displays.
Brumby:
--- Quote from: Brumby on January 16, 2020, 10:58:28 am ---Hope this will help you out. (I don't think I missed anything :D )
https://www.dropbox.com/s/rybj78uwiksgupo/Goldstar_OS-7020_Service_Manual.pdf?dl=0
File size is around 19MB - but I feel this is on the high side. If anyone can recommend some PDF wrangling software that might deliver a smaller file size, I'd be interested.
I've presented it as it was in the hardcopy, so viewing on screen should be fine, but printing it out as is might not give you what you expect.
--- End quote ---
PDF has been updated. It is now searchable and just under 11MB in size. The new version has just been uploaded to KO4BB and Dropbox.
Dropbox link is: https://www.dropbox.com/s/m8robnt2kxczc9g/Goldstar_OS7020_Oscilloscope_Service_Manual_searchable.pdf?dl=0
In a moment of weakness I spent some real money on PDF software.
AVGresponding:
Please excuse me if I'm missing something obvious here, but couldn't you just use a compression tool on the file?
Whales:
--- Quote from: ThickPhilM on February 02, 2020, 04:20:15 pm ---
--- Quote from: Brumby on February 02, 2020, 10:59:21 am ---
--- Quote from: Brumby on January 16, 2020, 10:58:28 am ---[...] If anyone can recommend some PDF wrangling software that might deliver a smaller file size, I'd be interested.
--- End quote ---
--- End quote ---
Please excuse me if I'm missing something obvious here, but couldn't you just use a compression tool on the file?
--- End quote ---
Compression background
To answer both of these questions, some background about compression is needed.
There are lossless compression techniques (eg .zip, .rar, .gz, .lzma, .7z, png) and lossy compression techniques (jpeg, most video codecs [h264, mpeg*], most audio codecs [opus, mp3, aac]). Lossy is a lot better at saving space, but it sacrifices some details.
Lossy is pretty much a default expectation of modern computing and society, with lossless only used where any loss would be inappropriate (eg text files, written documents).
Lossy compressors are very situation specific so I won't go into their details here. Most have variable control: more compression (more detail loss) or less compression (less detail loss). You may have encountered this when saving as JPEG.
Lossless compressors work by looking for repeating patterns in your files and replacing these repeating patterns with just one copy (plus some references). This is how archive formats like .zip, .rar and .7z work.
Regardless of which compression technique you use, the bits and bytes of the compressed output should look mostly indistinguishable from random data. This means there are no obvious patterns left that can be exploited for further compression, ie compressing a file multiple times will not make it smaller. Only one layer of compression should be applied or needed (with some technical exceptions that are not relevant here).
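A quick way to see this for yourself is the rough sketch below. It assumes a *nix shell with gzip available, and it uses random bytes as a stand-in for data that has already been compressed (eg jpegs); the file names are made up for the demo.

```shell
#!/bin/sh
# Sketch only: compare how well gzip does on patterned vs random-looking data.
tmp=$(mktemp -d)
size=1048576   # 1 MiB of each kind of input

# Lots of repetition: the same line over and over.
yes 'the same line over and over' | head -c "$size" > "$tmp/pattern.bin"
# No exploitable patterns: a stand-in for already-compressed data such as jpegs.
head -c "$size" /dev/urandom > "$tmp/random.bin"

gzip -c "$tmp/pattern.bin" > "$tmp/pattern.bin.gz"
gzip -c "$tmp/random.bin" > "$tmp/random.bin.gz"

# $(( ... )) strips any padding that wc adds on some systems.
pattern_gz=$(( $(wc -c < "$tmp/pattern.bin.gz") ))
random_gz=$(( $(wc -c < "$tmp/random.bin.gz") ))

echo "patterned: $size -> $pattern_gz bytes"
echo "random:    $size -> $random_gz bytes"
rm -r "$tmp"
```

The patterned file should shrink to a tiny fraction of its size, while the random one should come out no smaller than it went in. That is roughly what to expect when zipping a PDF that is already full of jpegs.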
Initial inspection
Now let's look at the specific PDF file: Goldstar_OS-7020_Service_Manual.pdf (19MiB). I'll start my inspection by extracting all of the images using pdfimages (part of poppler-utils):
--- Code: ---$ pdfimages ../GoldStar.pdf -all ex
$ ls
ex-000.jp2 ex-039.jpg ex-078.jpg ex-117.jpg ex-156.jpg ex-195.jpg ex-234.jpg ex-273.jpg
ex-001.jpg ex-040.jpg ex-079.jpg ex-118.png ex-157.jpg ex-196.jpg ex-235.jpg ex-274.jpg
ex-002.jpg ex-041.jpg ex-080.jpg ex-119.jpg ex-158.jpg ex-197.jpg ex-236.jpg ex-275.jpg
ex-003.jpg ex-042.jpg ex-081.jpg ex-120.jpg ex-159.jpg ex-198.jpg ex-237.jpg ex-276.jpg
ex-004.jpg ex-043.jpg ex-082.jpg ex-121.jpg ex-160.jpg ex-199.jpg ex-238.jpg ex-277.jpg
ex-005.jpg ex-044.jpg ex-083.jpg ex-122.jpg ex-161.jpg ex-200.jpg ex-239.jpg ex-278.jpg
ex-006.jpg ex-045.jpg ex-084.jpg ex-123.jpg ex-162.jpg ex-201.jpg ex-240.jpg ex-279.jpg
ex-007.jpg ex-046.jpg ex-085.jpg ex-124.png ex-163.jpg ex-202.jpg ex-241.jpg ex-280.jpg
ex-008.jpg ex-047.jpg ex-086.jpg ex-125.jpg ex-164.png ex-203.jpg ex-242.jpg ex-281.jpg
ex-009.jpg ex-048.jpg ex-087.jpg ex-126.jpg ex-165.jpg ex-204.jpg ex-243.jpg ex-282.jpg
ex-010.jpg ex-049.jpg ex-088.jpg ex-127.jpg ex-166.jpg ex-205.jpg ex-244.jpg ex-283.jpg
ex-011.jpg ex-050.jpg ex-089.jpg ex-128.jpg ex-167.jpg ex-206.jpg ex-245.jpg ex-284.jpg
ex-012.jpg ex-051.jpg ex-090.jpg ex-129.jpg ex-168.jpg ex-207.jpg ex-246.jpg ex-285.jpg
ex-013.jpg ex-052.jpg ex-091.jpg ex-130.jpg ex-169.jpg ex-208.jpg ex-247.jpg ex-286.jpg
ex-014.jpg ex-053.jpg ex-092.jpg ex-131.jpg ex-170.jpg ex-209.jpg ex-248.jpg ex-287.jpg
ex-015.jpg ex-054.jpg ex-093.jpg ex-132.jpg ex-171.jpg ex-210.jpg ex-249.jpg ex-288.jpg
ex-016.jpg ex-055.jpg ex-094.png ex-133.jpg ex-172.jpg ex-211.jpg ex-250.jpg ex-289.jpg
ex-017.jpg ex-056.jpg ex-095.jpg ex-134.jpg ex-173.jpg ex-212.jpg ex-251.jpg ex-290.jpg
ex-018.jpg ex-057.jpg ex-096.jpg ex-135.jpg ex-174.jpg ex-213.jpg ex-252.jpg ex-291.jpg
ex-019.jpg ex-058.jpg ex-097.jpg ex-136.jpg ex-175.jpg ex-214.jpg ex-253.jpg ex-292.jpg
ex-020.jpg ex-059.jpg ex-098.jpg ex-137.jpg ex-176.jpg ex-215.jpg ex-254.jpg ex-293.jpg
ex-021.jpg ex-060.jpg ex-099.jpg ex-138.jpg ex-177.jpg ex-216.jpg ex-255.jpg ex-294.jpg
ex-022.jpg ex-061.jpg ex-100.jpg ex-139.png ex-178.jpg ex-217.jpg ex-256.jpg ex-295.jpg
ex-023.jpg ex-062.jpg ex-101.jpg ex-140.jpg ex-179.jpg ex-218.jpg ex-257.jpg ex-296.jpg
ex-024.jpg ex-063.jpg ex-102.jpg ex-141.jpg ex-180.jpg ex-219.jpg ex-258.jpg ex-297.jpg
ex-025.jpg ex-064.jpg ex-103.jpg ex-142.jpg ex-181.jpg ex-220.jpg ex-259.jpg ex-298.jpg
ex-026.jpg ex-065.jpg ex-104.jpg ex-143.jpg ex-182.jpg ex-221.jpg ex-260.jpg ex-299.jpg
ex-027.jpg ex-066.jpg ex-105.jpg ex-144.jpg ex-183.jpg ex-222.jpg ex-261.jpg ex-300.jpg
ex-028.jpg ex-067.jpg ex-106.jpg ex-145.jpg ex-184.jpg ex-223.jpg ex-262.jpg ex-301.jpg
ex-029.jpg ex-068.jpg ex-107.jpg ex-146.jpg ex-185.jpg ex-224.jpg ex-263.jpg ex-302.jpg
ex-030.jpg ex-069.jpg ex-108.jpg ex-147.jpg ex-186.jpg ex-225.jpg ex-264.jpg ex-303.jpg
ex-031.jpg ex-070.jpg ex-109.png ex-148.jpg ex-187.jpg ex-226.jpg ex-265.jpg ex-304.jpg
ex-032.jpg ex-071.jpg ex-110.jpg ex-149.jpg ex-188.jpg ex-227.jpg ex-266.jpg ex-305.jpg
ex-033.jpg ex-072.jpg ex-111.jpg ex-150.jpg ex-189.jpg ex-228.jpg ex-267.jpg ex-306.jp2
ex-034.jpg ex-073.jpg ex-112.jpg ex-151.jpg ex-190.jpg ex-229.jpg ex-268.jpg ex-307.jp2
ex-035.jpg ex-074.jpg ex-113.jpg ex-152.png ex-191.jpg ex-230.jpg ex-269.jpg ex-308.jp2
ex-036.jpg ex-075.jpg ex-114.jpg ex-153.jpg ex-192.jpg ex-231.jpg ex-270.jpg
ex-037.jpg ex-076.jpg ex-115.jpg ex-154.jpg ex-193.jpg ex-232.jpg ex-271.jpg
ex-038.jpg ex-077.jpg ex-116.jpg ex-155.jpg ex-194.jpg ex-233.jpg ex-272.jpg
--- End code ---
It looks like this file is nothing but jpegs, plus the occasional png and jp2. That means that to further (re)compress this file, what we really need to do is focus on the jpegs. The PDF is nothing but a wrapper around them. This is basically always the case for scanned PDFs.
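If you only want to know what is inside a PDF without extracting anything, pdfimages also has a -list option. A minimal sketch, assuming poppler-utils is installed; list_pdf_images is just a hypothetical wrapper name and manual.pdf a placeholder:

```shell
#!/bin/sh
# Sketch: summarise how the images inside a PDF are stored, without extracting them.
# Assumes pdfimages (poppler-utils) is installed. Nothing runs until the function is called.

# Print the per-image table for pages 1..last_page (default: first 3 pages).
list_pdf_images() {
    pdf=$1
    last_page=${2:-3}
    pdfimages -list -f 1 -l "$last_page" "$pdf"
}

# Usage (placeholder file name):
#   list_pdf_images manual.pdf 2
```

The table has one row per embedded image, including its page number, pixel dimensions and encoding, which is enough to spot a scanned PDF that is really a stack of jpegs.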
We have several options to reduce the file sizes of images:
* Reduce their size (resolution)
* Reduce their bitdepth (number of colours)
* Increase their compression level (throw out more data during lossy/jpeg compression)
* Change the lossy codec to something else (eg from jpeg to jp2)
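For reference, here is roughly how each of those four options might look with ImageMagick on a single image. This is a sketch, not a recipe: the function names are made up, the numbers (50%, 16 colours, quality 40) are arbitrary examples, it assumes ImageMagick 7 (the command is "convert" on version 6), and writing .jp2 only works if your build has JPEG 2000 support.

```shell
#!/bin/sh
# Sketch: one ImageMagick invocation per size-reduction option.
# Each function takes an input image and an output file name.

# Option 1: reduce the resolution (here: half the width and height).
shrink_resolution() { magick "$1" -resize 50% "$2"; }

# Option 2: reduce the number of colours (here: greyscale, at most 16 shades).
reduce_bitdepth() { magick "$1" -colorspace Gray -colors 16 "$2"; }

# Option 3: compress harder (lower -quality = more detail thrown away).
raise_compression() { magick "$1" -quality 40 "$2"; }

# Option 4: change codec; ImageMagick picks it from the output extension.
change_codec() { magick "$1" "$2"; }

# Usage (placeholder names):
#   raise_compression page.jpg page_q40.jpg
#   change_codec page.jpg page.jp2
```

Whichever option you try, compare the output against the original by eye; the file size alone tells you nothing about readability.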
Sidenote: file quirks & better software
Unfortunately for us each page of this PDF is not a single jpeg. Each page has been split into multiple jpegs that are aligned together in a grid:
Some software does this, it's quite annoying :P
To work around this, we're going to use the imagemagick tools from now on. These are smart enough to treat each page as a single image. This particular software is very common and well known in the *nix and web-development worlds (many websites use it behind the scenes for image processing), but not well known in the Windows world, even though there is a Windows version available. The world is weird. This software is the duck's guts for anything that requires mass-editing of multiple files simultaneously.
My method
(1) Remove the very first and last pages. It's pretty to see the paper texture from the original manual cover, but it's all undulating and hard to compress. The second page of the PDF is the same as the cover anyway, just in black and white.
(2) Convert everything to greyscale. There is no point having colour anywhere in this document, so let's not make the compressor (eg jpeg) think it has to preserve it. Strip it all out.
(3) Remove ghosted text from other pages by fiddling with brightness/levels. I'm talking about this stuff (emphasized via editing):
You can't (normally) see this and it's useless, so let's get rid of it. The fact it's still there in the image means the last person's compressor was wasting space trying to keep it.
(4) Avoid reducing the resolution. Resolution is nice, esp in technical documents. There's nothing worse than a blurry technical diagram. The whole point of these documents is to make people happy, not grumpy.
(5) Try several different compressors (jpeg, jpeg2000, png, etc). PDFs can house quite a few different formats fine.
First attempt: adjust levels, greyscale, jpeg
--- Code: ---$ mkdir temp1
$ magick convert -density 100 Goldstar_OS-7020_Service_Manual.pdf -set colorspace Gray -level '25%,75%,0.3' -quality 70 'temp1/page%03d.jpeg'
$ rm temp1/page000.jpeg temp1/page099.jpeg
$ du -sh temp1/
9.1M
--- End code ---
9.1MiB is not bad, but we can be squeezier if we're clever.
Second attempt: adjust levels, greyscale, png
--- Code: ---$ mkdir temp2
$ convert -density 100 Goldstar_OS-7020_Service_Manual.pdf -set colorspace Gray -level '25%,75%,0.4' 'temp2/page%03d.png'
$ rm temp2/page000.png temp2/page099.png
$ du -sh temp2/
12M
--- End code ---
Eep, wrong way. This codec doesn't seem useful with this sort of image data.
... but what if we massage the image into something that png would actually like? Eg black and white (no greyscale)?
Third attempt: threshold into black and white, png
--- Code: ---$ mkdir temp3
$ convert -density 100 Goldstar_OS-7020_Service_Manual.pdf -set colorspace Gray -alpha off -auto-threshold OTSU 'temp3/page%03d.png'
$ rm temp3/page000.png temp3/page099.png
$ du -sh temp3/
1.7M
--- End code ---
:D
I think the resolution is a bit low however, so I up it to -density 200 and try again:
--- Code: ---$ du -sh temp4
3.8M
--- End code ---
Not as good filesize wise, but the diagrams are a lot more readable.
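The command for the -density 200 run isn't shown above. Presumably it was the third attempt again with the density and output directory changed, so as a guess it can be sketched as a small helper (threshold_pages is a hypothetical name, not part of ImageMagick):

```shell
#!/bin/sh
# Sketch: the third attempt's command with density and output directory as parameters.
# Assumption: the -density 200 run used the same flags as the -density 100 run.
threshold_pages() {
    density=$1
    pdf=$2
    outdir=$3
    mkdir -p "$outdir"
    # Rasterise each page, drop colour and alpha, then threshold to pure black and white.
    convert -density "$density" "$pdf" -set colorspace Gray -alpha off \
        -auto-threshold OTSU "$outdir/page%03d.png"
}

# Usage (mirrors the temp3/temp4 runs):
#   threshold_pages 100 Goldstar_OS-7020_Service_Manual.pdf temp3
#   threshold_pages 200 Goldstar_OS-7020_Service_Manual.pdf temp4
```

Remember to delete the first and last page images afterwards, as in the earlier attempts, before comparing sizes with du.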
Whales:
Turning the individual image pages back into a single PDF
I think it's possible to use imagemagick to do this, but you have to tell it not to recompress the files a second time. Instead I used img2pdf (probably not easy to get on Windows):
--- Code: ---$ img2pdf temp3/* -o Goldstar_OS7020_servicemanual_tiny.pdf
$ img2pdf temp4/* -o Goldstar_OS7020_servicemanual_reasonable.pdf
--- End code ---
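One thing the commands above quietly rely on: the shell expands temp3/* in sorted order, and the zero-padded page%03d names are what make that the page order. A small sketch with made-up file names shows what goes wrong without the padding (LC_ALL=C is set only to make the sort order predictable):

```shell
#!/bin/sh
# Sketch: glob expansion sorts names as text, so unpadded numbers end up out of order.
LC_ALL=C
export LC_ALL

tmp=$(mktemp -d)
for n in 1 2 10; do
    : > "$tmp/$(printf 'padded%03d.png' "$n")"   # padded001.png, padded002.png, padded010.png
    : > "$tmp/plain$n.png"                       # plain1.png, plain2.png, plain10.png
done

padded_order=$(cd "$tmp" && echo padded*)
plain_order=$(cd "$tmp" && echo plain*)
rm -r "$tmp"

echo "zero-padded: $padded_order"
echo "unpadded:    $plain_order"
```

With the padded names the pages come out as 1, 2, 10; with the unpadded names page 10 lands before page 2, and the PDF would be assembled in the wrong order.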
The results are attached.
I have not spent enough time quality checking: some of the large diagrams have unreadable numbers. It's also hard to differentiate grid lines and signal on the scope shots; perhaps thresholding in black-and-white wasn't the best option here.
Opinion
Don't throw out the original scans. Keep both the original scans PDF and a small PDF (like mine) side by side and never separate them. One for convenience, the other for when the convenient version screws up (unreadable detail). You can't recover data that has been thrown out.
Text searchability
Also known as OCR (optical character recognition). Never 100% accurate, but useful. Not going into it here :P It sounds like you already have something that works.
I want a program that can do all of this for me!
No such program will ever exist, just as there will never be a program that writes the manual for you. Every step of the above has required human analysis and choosing between one of many options.
There will be programs that will claim to compress PDFs, but their performance will be hit and miss because they will guess at what is best to do. None of these programs try to read the circuit diagrams afterwards and determine if they are readable or not. None of these programs can tell that ghosted text from other pages is not useful information. And so on.
We are after all trying to compress against human perception, not machine perception.