|
Post by goolash on Nov 8, 2022 12:21:48 GMT
Hi everyone. This is probably the only place on the web where I can get any help with my issue. I bought a non-functional A500 unit and, of course, I have no idea what she has been through. After fixing a bad solder job on the only electrolytic capacitor, I managed to boot her up to some extent: she boots until the point where the music starts and then she goes off. Looking at the UART logs, I have noticed two things:
- The NAND is probably corrupted
- She restarts herself after the login prompt pops up in the console:
Starting THEA500...
starting pid 1020, tty '/dev/ttyS0': '/sbin/getty -L ttyS0 0 vt100 '
___ _ ___ _ _ _
| _ \___| |_ _ _ ___ / __|__ _ _ __ ___ ___ | | | |_ __| |
| / -_) _| '_/ _ \ | (_ / _` | ' \/ -_|_-< | |_| _/ _` |
|_|_\___|\__|_| \___/ \___\__,_|_|_|_\___/__/ |____\__\__,_|
RedquarkSix - THEA500 Mini
process '/sbin/getty -L ttyS0 0 vt100 ' (pid 1020) exited. Scheduling for restart.
starting pid 1032, tty '/dev/ttyS0': '/sbin/getty -L ttyS0 0 vt100 '
___ _ ___ _ _ _
| _ \___| |_ _ _ ___ / __|__ _ _ __ ___ ___ | | | |_ __| |
| / -_) _| '_/ _ \ | (_ / _` | ' \/ -_|_-< | |_| _/ _` |
|_|_\___|\__|_| \___/ \___\__,_|_|_|_\___/__/ |____\__\__,_|
RedquarkSix - THEA500 Mini
redquarksix login: [ 12.019565] sunxi_dramfreq_irq_enable_control: the key master doesn't exit
[ 12.027536] sunxi_dramfreq_irq_mask_control: the key master doesn't exit[ 12.035880] [NAND]shutdown_flush_write_cache
[ 12.041285] [NAND][NAND]shutdown end
[ 12.150701] Restarting system.
So my question is: is there any way to reflash the firmware over the UART console? I'm OK with desoldering the NAND and reprogramming it with an external programmer, but I want to avoid that because of its proximity to the processor. I'm also attaching the whole boot log here: putty.log (12.36 KB)
|
|
|
Post by spannernick on Nov 8, 2022 13:56:21 GMT
To me it looks like it has a problem reading the bod file that's part of the carousel; it only does that if someone has tried to change it. This might help you: look for the THEA500 Mini part in the post `On THEA500 - How to Copy Nandb to THEA500 and restore THEA500.` - thec64community.online/thread/76/2022-thec64-mini-maxi-bacup
|
|
|
Post by jj0 on Nov 8, 2022 14:40:37 GMT
There's a lot of background info here, and RGL have also described how to boot into a shell here. At first glance I suspect that something happened to the SoC backup key, which is checked by the manhattan binary; if the key is incorrect, it reboots. oleavr has a solution for that, but I don't think it's on this forum.
|
|
|
Post by goolash on Nov 8, 2022 15:45:02 GMT
Thank you for the response. I'm actually getting this bizarre error that keeps me from loading nand.ko and mounting the USB drive:
/ # [ 2.280013] HDMI cable is connected
[ 2.540961] scsi 0:0:0:0: Direct-Access USB Flash Memory PMAP PQ: 0 ANSI: 0 CCS
[ 2.551941] sd 0:0:0:0: [sda] 7839744 512-byte logical blocks: (4.01 GB/3.73 GiB)
[ 2.560935] sd 0:0:0:0: [sda] Write Protect is off
[ 2.566194] sd 0:0:0:0: [sda] Mode Sense: 23 00 00 00
[ 2.572560] sd 0:0:0:0: [sda] No Caching mode page found
[ 2.578390] sd 0:0:0:0: [sda] Assuming drive cache: write through
[ 2.589434] sd 0:0:0:0: [sda] No Caching mode page found
[ 2.595364] sd 0:0:0:0: [sda] Assuming drive cache: write through
[ 2.602970] sda: sda1
[ 2.607933] sd 0:0:0:0: [sda] No Caching mode page found
[ 2.613865] sd 0:0:0:0: [sda] Assuming drive cache: write through
[ 2.620674] sd 0:0:0:0: [sda] Attached SCSI removable disk
[ 4.350015] error: invalid cea_vic code:0
[ 4.354413] Sink do NOT Support cea vic mode:0
[ 4.359286] [HDMI2 error]: sink do not support this mode:0
I've tried different HDMI cables and monitors/TVs, but the result is the same :/
OK, once the cable was disconnected it went further. I'm currently recovering nanda and nandb from the provided files. I'll let you know how it goes. . . . No luck :/ Recovering nanda and nandb brought nothing new; same situation as before. jj0, where can I find something about that SoC backup key? Or does anyone have a full dump of the A500 NAND?
|
|
|
Post by goolash on Nov 9, 2022 11:22:57 GMT
I cannot see anything under the first link, but the data in the second link are exactly the same as those provided by spannernick. I was thinking of a full NAND dump: one image that includes the U-Boot env and all the NAND partitions. Recovering partitions A and B brought nothing new.
|
|
|
Post by jj0 on Nov 9, 2022 13:06:42 GMT
The SoC backup key info was on a Discord channel somewhere, but that's gone. oleavr created a change to the kernel that negates the check. You'd need to extract the kernel from the nanda dump, apply the change and then rebuild a new nanda with it. Then try loading it from USB and booting it to see if that fixes your issue. The change consists of replacing the 4 bytes at offset 0x002479d4 with 1F 20 03 D5 (from 60 6A 36 B8).
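For anyone who prefers a script to a hex editor, the byte replacement described above could be sketched roughly as follows. This is only an illustration, not oleavr's actual tooling: the function names and the file names in the usage note are made up, and it assumes the offset applies to the extracted zImage. The one thing that can be stated with confidence is that 1F 20 03 D5 is the little-endian encoding of the AArch64 NOP instruction (0xD503201F), so the patch blanks out a single instruction.

```python
from pathlib import Path

PATCH_OFFSET = 0x002479D4                # offset quoted in the post
OLD_BYTES = bytes.fromhex("606A36B8")    # bytes expected before patching
NEW_BYTES = bytes.fromhex("1F2003D5")    # AArch64 NOP, little-endian


def patch_image(data: bytes, offset: int = PATCH_OFFSET,
                old: bytes = OLD_BYTES, new: bytes = NEW_BYTES) -> bytes:
    """Return a patched copy of data; refuse if the expected bytes are absent."""
    found = data[offset:offset + len(old)]
    if found == new:
        return data  # already patched, nothing to do
    if found != old:
        raise ValueError(
            f"unexpected bytes at {offset:#x}: {found.hex(' ')} "
            f"(expected {old.hex(' ')})"
        )
    return data[:offset] + new + data[offset + len(old):]


def patch_file(src: str, dst: str) -> None:
    """Hypothetical helper: read src, patch it, write the result to dst."""
    Path(dst).write_bytes(patch_image(Path(src).read_bytes()))
```

Something like `patch_file("zImage", "zImage.patched")` would then write a patched copy and leave the original untouched. Checking the old bytes first guards against patching a different kernel build, where the same offset may hold something else.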
|
|
|
Post by goolash on Nov 9, 2022 14:08:45 GMT
Hmm, I have no idea how to rebuild the image after changing the bytes (are there any tutorials online I could use?). I just tried with the bytes at that offset changed, but (surprise, surprise) it failed due to a bad magic number. Or maybe such an image is already available somewhere?
|
|
|
Post by spannernick on Nov 9, 2022 15:35:21 GMT
If you have already made a nanda with the change applied, would it be easier to share it? nanda is not under RGL copyright, it's under the GPL. Just saying, because not everyone can build a new nanda.
|
|
|
Post by spannernick on Nov 9, 2022 15:45:43 GMT
Is there a way of finding the offset in the nanda image using HxD? If it's only 4 bytes, it has to be in there somewhere. Or is it compiled differently, so it would not be there? Just look for 60 6A 36 B8...?
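The same search can be scripted instead of done in HxD. A minimal sketch, with a made-up function name: it lists every offset where the 4-byte sequence occurs. Two caveats are assumptions rather than facts about this particular image: a 4-byte pattern may well match in more than one place, and if the kernel is stored compressed inside the file being searched, the pattern will not be found at all.

```python
from typing import List

PATTERN = bytes.fromhex("606A36B8")  # original bytes quoted in the thread


def find_all(data: bytes, pattern: bytes = PATTERN) -> List[int]:
    """Return every offset at which pattern occurs in data."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets
```

For a real file this would be called as `find_all(open("zImage", "rb").read())`. If the result contains more than one offset, the pattern alone is not enough to identify the right instruction.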
|
|
|
Post by goolash on Nov 9, 2022 16:02:26 GMT
Already done that: as mentioned, I tried with the changed bytes, but it failed due to a bad magic number.
|
|
|
Post by jj0 on Nov 9, 2022 17:24:08 GMT
A failed magic number means you hex-edited the nanda file itself, but that has a checksum (the magic number). The procedure is described in general in the RGL link in my first reply, except that you will need to edit the zImage and not the initrd.img.
Extract the nanda:
abootimg -x nanda
Then edit zImage. Then create a new nanda:
abootimg --create nanda-nocheck -f bootimg.cfg -k zImage -r initrd.img
And then try to boot with nanda-nocheck. But if you don't want to go through that exciting and educational process, you can try with this one. I created it some time ago; it has some other experimental stuff in it (it will open a login shell on /dev/ttyUSB0 if available; you can plug in an FTDI (VID 0403 PID 6001) USB2Serial converter to use this), but it should work to test whether the SoC backup key is the issue.
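The extract, edit and rebuild steps above can be wrapped in a small script so they are repeatable. This is only a sketch: the helper names are made up, the abootimg arguments are exactly the ones quoted in the post, and it assumes abootimg is installed and on the PATH. By default it only prints the commands.

```python
import subprocess
from typing import List


def extract_cmd(image: str = "nanda") -> List[str]:
    """argv for unpacking the boot image in the current directory."""
    return ["abootimg", "-x", image]


def create_cmd(output: str = "nanda-nocheck") -> List[str]:
    """argv for packing the (edited) zImage back into a new boot image."""
    return ["abootimg", "--create", output,
            "-f", "bootimg.cfg", "-k", "zImage", "-r", "initrd.img"]


def run(argv: List[str], dry_run: bool = True) -> None:
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(argv))
    if not dry_run:
        subprocess.run(argv, check=True)  # raises if abootimg exits non-zero
```

The intended order is `run(extract_cmd())`, then editing zImage, then `run(create_cmd())`. Passing `dry_run=False` actually runs the tool.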
|
|
|
Post by goolash on Nov 9, 2022 19:54:57 GMT
OK, so I think I'll have to give up. I have reflashed the whole NAND using UART and the provided commands, but I still get the same error. I even went as far as desoldering the NAND and reprogramming it in an external programmer, with the same result. I also flashed the nanda-jj0-v9-remove-no-tamper image as nanda, but got the same result upon reboot.
| _ \___| |_ _ _ ___ / __|__ _ _ __ ___ ___ | | | |_ __| |
| / -_) _| '_/ _ \ | (_ / _` | ' \/ -_|_-< | |_| _/ _` |
|_|_\___|\__|_| \___/ \___\__,_|_|_|_\___/__/ |____\__\__,_|
RedquarkSix - THEA500 Mini
process '/sbin/getty -L ttyS0 0 vt100 ' (pid 1034) exited. Scheduling for restart.
starting pid 1047, tty '/dev/ttyS0': '/sbin/getty -L ttyS0 0 vt100 '
___ _ ___ _ _ _
| _ \___| |_ _ _ ___ / __|__ _ _ __ ___ ___ | | | |_ __| |
| / -_) _| '_/ _ \ | (_ / _` | ' \/ -_|_-< | |_| _/ _` |
|_|_\___|\__|_| \___/ \___\__,_|_|_|_\___/__/ |____\__\__,_|
RedquarkSix - THEA500 Mini
redquarksix login: [ 8.492350] sunxi_dramfreq_irq_enable_control: the key master doesn't exit
[ 8.499756] sunxi_dramfreq_irq_mask_control: the key master doesn't exit[ 8.522873] [NAND]shutdown_flush_write_cache
[ 8.529336] [NAND][NAND]shutdown end
[ 8.640306] Restarting system.
As mentioned in the first post, I have no idea what she went through; the only visible tampering was on the 16V/470uF capacitor. I checked the other components against that photo and everything looks fine. So I have no idea whether this could be a hardware issue. I got some write errors during the UART erase/flash procedure (ca. 2-5 blocks), but that should not be the issue (I think?).
|
|
|
Post by goolash on Nov 9, 2022 21:19:44 GMT
...hmmm, I'm going through random logs of different H6 machines (not the A500 Mini) and I quite often see the same line:
[ 8.466372] sunxi_dramfreq_irq_mask_control: the key master doesn't exit
Can anyone check whether that line is present on a healthy boot? (Or provide a full boot log?)
|
|
|
Post by jj0 on Nov 9, 2022 21:22:00 GMT
Embarrassing fact: I tried my nanda-jj0-v9-remove-no-tamper and it doesn't work for me either! I'm not sure what went wrong; I'll look into it. Sorry about that. But to prove that the reboot is due to the SoC backup key, try temporarily removing the /etc/init.d/S99skyline startup script and then reboot. If your system then doesn't reboot, I'm pretty sure it is due to the SoC backup key.
|
|
|
Post by goolash on Nov 10, 2022 11:20:42 GMT
I don't think I follow the logic here. My system is not booting up right now, so once that script is removed the system will still not boot? (With a different error message, maybe?) Nevertheless, it is worth a try... however, I'm not able to mount the nanda device,
either the proper partition
/ # mount /dev/nanda /a
mount: mounting /dev/nanda on /a failed: Invalid argument
nor the img file from the USB stick
/ # mount /mnt/nanda.img /a
mount: mounting K on /a failed: Invalid argument
I created the "a" folder beforehand, so that is OK.
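A possible explanation for the "Invalid argument" errors, offered as an assumption rather than a confirmed diagnosis: elsewhere in this thread nanda is unpacked with abootimg, a tool for Android-style boot images. Such an image is a container for a kernel and a ramdisk rather than a filesystem, so mount would be expected to reject it. Android boot images start with the 8-byte magic "ANDROID!", which gives a quick way to check what a dump actually is. The function names below are made up for illustration.

```python
ANDROID_BOOT_MAGIC = b"ANDROID!"  # first 8 bytes of an Android boot image


def looks_like_boot_image(header: bytes) -> bool:
    """True if the given leading bytes carry the Android boot image magic."""
    return header.startswith(ANDROID_BOOT_MAGIC)


def file_looks_like_boot_image(path: str) -> bool:
    """Read only the first 8 bytes of path and test them for the magic."""
    with open(path, "rb") as fh:
        return looks_like_boot_image(fh.read(len(ANDROID_BOOT_MAGIC)))
```

If `file_looks_like_boot_image("/mnt/nanda.img")` returns True, the file is something to unpack with abootimg rather than to mount.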
|
|
|
Post by jj0 on Nov 10, 2022 11:50:19 GMT
OK, as it turned out, not only is the kernel patch needed, but also changing the SoC backup key to the one that THE500 expects, using oleavr's fix-socinfo tool. I've added that to my modified nanda-jj0-v10-remove-no-tamper and tested it on my faulty THE500, and it works for me now.
The "sunxi_dramfreq_irq_mask_control: the key master doesn't exit" message (it should actually read "doesn't exist") should be OK, I think. From the kernel source of drivers/devfreq/ddrfreq/sunxi_dramfreq.c:
static void sunxi_dramfreq_irq_mask_control(struct sunxi_dramfreq *dramfreq, bool valid)
{
    unsigned int i;
    int idx, reg_list_num;

    sunxi_dramfreq_read_irq_mask_status(dramfreq);
    for (i = 0; i < MASTER_MAX; i++) {
        idx = key_masters_int_idx[i][1];
        reg_list_num = key_masters_int_idx[i][2];
        if (idx < 0 || reg_list_num < 0) {
            pr_err("%s: the key master doesn't exit", __func__);
            return;
        }

        if (valid) {
            dramfreq->irq_access_mask_sta[reg_list_num] |= (0x1 << idx);
            dramfreq->irq_idle_mask_sta[reg_list_num] |= (0x1 << idx);
        } else {
            dramfreq->irq_access_mask_sta[reg_list_num] &= (~(0x1 << idx));
            dramfreq->irq_idle_mask_sta[reg_list_num] &= (~(0x1 << idx));
        }
    }
    sunxi_dramfreq_write_irq_mask_status(dramfreq);
}
But for the H6 (SUN50IW6) the idx and reg_list_num are < 0 in the key_masters_int_idx array:
#elif defined(CONFIG_ARCH_SUN50IW6)
/* the sun50iw6 doesn't support dramfreq, but need to offer cur_freq attr */
static int key_masters_int_idx[MASTER_MAX][3] = {
    { MASTER_NULL, -1, -1 },
};
I see the error in my boot log as well.
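To make the control flow concrete, here is a toy Python model of the loop in the quoted driver function. It is not the driver itself. MASTER_NULL is given a placeholder value because its real value is not shown in the thread, and the register updates are left out. It only shows that with the H6 table the very first row has negative indices, so the function logs the message and returns on every call.

```python
MASTER_NULL = 0  # placeholder; the real enum value is not shown in the thread

# Mirrors the CONFIG_ARCH_SUN50IW6 table quoted above: one row, both indices -1.
KEY_MASTERS_INT_IDX_H6 = [(MASTER_NULL, -1, -1)]


def irq_mask_control(table, func_name="sunxi_dramfreq_irq_mask_control"):
    """Return the messages the quoted C function would log for this table."""
    messages = []
    for _master, idx, reg_list_num in table:
        if idx < 0 or reg_list_num < 0:
            messages.append(f"{func_name}: the key master doesn't exit")
            return messages  # early return, as in the C code
        # The real driver would set or clear bit idx in the mask registers here.
    return messages
```

Calling `irq_mask_control(KEY_MASTERS_INT_IDX_H6)` yields the one log line seen in the boot logs, which fits the reading that the message is expected on this SoC and unrelated to the reboot.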
|
|
|
Post by goolash on Nov 10, 2022 13:08:48 GMT
jj0, you are a miracle worker!!! Once I flashed v10 of your file, my A500 came back to life. But what does it all actually mean? What is the root cause here, and how does the solution work? I guess with each update my A500 will be dead again?
|
|
|
Post by jj0 on Nov 10, 2022 15:34:25 GMT
Thanks - it was oleavr who figured this out, I'm just applying his fix. Basically, there's a key (the SoC backup key) programmed into the CPU/SoC by RGL. The Carousel checks whether this key is correct and, if not, reboots. Apparently, for some reason this key can be overwritten with 0's, and then, if a special bit (aka efuse) is permanently set in the CPU/SoC, you can't rewrite the key to the expected one. I'm not sure what causes it, whether it is an accident or an 'anti-tamper measure' by RGL. The fix is a modified kernel that doesn't read the actual key, in combination with a tool that sets the correct key in the kernel's memory, so that when the Carousel checks it the right key is returned. As I do this in nanda, it will most likely not be affected by RGL updates because (statistically speaking) RGL doesn't update the nanda in their firmware updates. Obviously they could start checking a checksum of nanda and detect that it differs from the standard one, but I don't think that will happen.
|
|
|
Post by goolash on Nov 10, 2022 18:27:43 GMT
Correct! I've successfully updated the firmware to the newest version. Thank you again, jj0, for the image and oleavr for the solution!!
|
|
|
Post by spannernick on Nov 11, 2022 0:53:14 GMT
Could a blackout cause it to fail too, if THEA500 Mini gets a surge of electricity? Does THEA500 Mini have protection against that? Everything RGL did to stop people messing around with it could have caused more A500s to be returned because they stopped working.
|
|
|
Post by goolash on Nov 11, 2022 7:58:27 GMT
So, in the end (correct me if I'm wrong), the standard nanda image is NOT interchangeable because of the SoC key check?
|
|
|
Post by jj0 on Nov 11, 2022 8:42:19 GMT
It is interchangeable; nanda does not do the SoC key check. nanda contains the Linux kernel (and an initial rootfs ramdisk image), and that should be the same for every THE500. Though of course RGL could do minor updates for new batches of THE500s, e.g. if they use a different NAND chip. But when they did this for THE64, it was still backwards compatible, I think. The SoC key is only checked when the Carousel starts, so that is after nanda has already been used/loaded.
|
|
|
Post by jj0 on Nov 11, 2022 8:43:09 GMT
I think the protection against a surge would/should be in the power supply.
|
|
|
Post by spannernick on Nov 11, 2022 11:27:16 GMT
OK, thanks... Some info about the carousel.
The carousel checks resources.bod, and all the files in it, using a hash. I tried replacing one of the images in its bod file with a different one, and it stops the carousel and makes it power off like above, with "the key master doesn't exit". You can see the image I added in the carousel too (it was the Favourites 1 cover from PCUAE; I replaced the first image in the bod file, Alien Breed 3D, by matching its code in HxD). So if we remove the check from nanda, the carousel might run after you change an image in the bod file, because it's the SoC backup key check that is stopping the carousel from running. What is probably happening is that the SoC backup key check is checking the bod file, getting the wrong hash, and then shutting down and saying the key master does not exit. We need an easy way of replacing nanda so that it is easier for the user to change; then we might be able to change the games in the carousel's bod file and make different bod files as different sets of games.
I even added some data to the Favourites 1 cover PNG so that it was the same size as the Alien Breed 3D cover PNG, and it did the same thing: it powered off with "the key master doesn't exit".
Here is a video of it trying to load the carousel with the Alien Breed 3D cover image changed to the Favourites 1 cover, after which the check makes the A500 Mini power off.
Make the video full screen so you can see the boot log in PuTTY on the left of the screen. Pause the video at 0.53 seconds and you will see "the key master doesn't exit" after it loads the carousel. You can see it loads THEA500 Mini splash screen too, so maybe the hash check happens after the splash screen loads; the first PNG file in the bod file is the Alien Breed 3D cover, which is why I changed it.
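If the carousel really does verify the bod file with a hash, padding the replacement image to the same size would not be expected to help, because a hash depends on every byte and not on the length. The sketch below illustrates that point only. SHA-256 is a stand-in, since the thread does not say which algorithm (if any) the carousel uses, and the byte strings are dummy data rather than real PNG content.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Hex digest of data; SHA-256 is used purely as an example algorithm."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_hex: str) -> bool:
    """True if data hashes to expected_hex."""
    return hmac.compare_digest(digest(data), expected_hex)


original = b"ALIEN-BREED-3D-COVER"     # dummy stand-in for the original image
replacement = b"FAVOURITES-1-COVER.."  # dummy replacement, padded to same length
stored = digest(original)              # what a checker would have on record

same_size = len(original) == len(replacement)
passes = verify(replacement, stored)
```

Here `same_size` is True while `passes` is False, so a same-size swap still fails such a check. Changing the bod contents would therefore need the stored hash to be updated as well, or the check to be bypassed.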
|
|