I was thinking of using simple optocouplers like the PC817, but I can only get around 9600 bps with these simple ones. So the question is: can we do it with under US$0.10 in total parts? Do we have this option?
My guess is that these devices are not silicon and also that they require more complicated packaging (two separate dice with an optical channel between them). There likely hasn't been a big push for cost optimization.
For the "normal" optos (4N series), you need to pay attention to the rise and fall times in the datasheets. You'll want rise plus fall to take at most 20% of your bit period (at 115200 baud the bit period is about 8.7 us, so tr+tf < 1.7 us). The PS2701 is still too slow.
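If it helps, here is a rough sketch of that 20% rule of thumb as a quick calculation. The function names are my own, and the tr/tf numbers in the example are placeholders rather than values from any datasheet, so plug in the figures for your part at your actual load resistor.

```python
def bit_period_us(baud: float) -> float:
    """Length of one bit in microseconds at the given baud rate."""
    return 1e6 / baud


def edge_budget_us(baud: float, fraction: float = 0.20) -> float:
    """Maximum allowed tr + tf in microseconds (rule of thumb: 20% of a bit)."""
    return fraction * bit_period_us(baud)


def fast_enough(tr_us: float, tf_us: float, baud: float,
                fraction: float = 0.20) -> bool:
    """True if the combined rise and fall time fits within the edge budget."""
    return (tr_us + tf_us) <= edge_budget_us(baud, fraction)


if __name__ == "__main__":
    for baud in (9600, 115200):
        print(f"{baud} baud: bit period {bit_period_us(baud):.2f} us, "
              f"tr+tf budget {edge_budget_us(baud):.2f} us")

    # Placeholder numbers for illustration only, NOT datasheet values.
    tr_us, tf_us = 4.0, 3.0
    for baud in (9600, 115200):
        print(f"tr={tr_us} us, tf={tf_us} us at {baud} baud:",
              "ok" if fast_enough(tr_us, tf_us, baud) else "too slow")
```

Keep in mind that phototransistor optos slow down a lot with larger pull-up resistors, so use the tr/tf figures that match your actual test conditions, not just the headline typical values.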
I was able to find the 6N137, which has an integrated digital receiver, for about half the cost of the CA-IS3721HS. Its datasheet headline states 10 MBd, so it should work fine at 115200. You can find clones on LCSC for about US$0.17.