The Unused Compute All Around Us

June 3, 2026 • 26 min read

Sometimes a big infrastructure question starts with a very small picture: at night, a smartphone sits on its charger. The laptop is shut. The gaming console waits in the living room. The car is parked in the garage. Compute is everywhere: already paid for, supplied with power, and mostly doing nothing.

At the same time, new data centers are being built, as large as industrial plants: halls full of GPUs, fiber, transformers, cooling technology and power contracts. We are building a new layer of digital infrastructure that will carry our writing, searching, programming, analysis, and perhaps one day even our decisions.

What if a small part of that capacity did not simply go to waste? This is not a wild fantasy in which every phone suddenly replaces a data center. It is a serious thought experiment: could there be a Compute-Smart-Grid in which devices contribute compute voluntarily, within limits, and in return for payment?

"From PRISM to Prompts" covers the other side of this development: growing dependence on a handful of AI platforms, mainly from the USA and China. Here it is the opposite idea that interests me. Not as naïve P2P romance, but as a technical question: how much compute capacity is already lying around in distributed form, which AI tasks can really be distributed, and what would it take to pay people fairly?

Compute cannot be stored. An unused GPU hour is not a reserve. It simply expires.

Thought experiment

AI is not just software. AI is electricity, cooling, fiber, GPUs, land, water and capital. The International Energy Agency estimates that data centers consumed about 415 TWh of electricity worldwide in 2024, roughly 1.5 percent of global electricity consumption. By 2030 that could rise to about 945 TWh.

This is not just a number for sustainability reports. It is infrastructure policy. AI services are available 24/7. Every summary, every code question, every image, every agent run is a computation. And once billions of people and companies route their work through AI loops, that becomes baseload.

That is why I understand the fascination with large central solutions. Data centers are controllable: the same hardware, the same racks, the same networks, clear security zones, SLAs, monitoring, billing. Operationally that is attractive. Politically, economically and architecturally, however, it recreates the very thing that has always made networks vulnerable: a few centers of power.

So the thought experiment starts with a simple counter-question: before we build the next data center, what already exists?

How large is the unused capacity?

The idea really comes from an entirely everyday moment. The smartphone lies on the bedside table at night, plugged in, doing almost nothing. Yet inside it is a chip with more AI-specific compute than several whole computers combined had ten years ago. An iPhone 15 Pro with the A17 Pro reaches about 35 trillion Neural Engine operations per second. Even if we take only a cautious average of that, it is absurdly high for a device that spends most of the night waiting.

The same happens on the desk. New notebooks contain not only a CPU and a GPU but also NPUs or Neural Engines. Apple has been building a Neural Engine into its chips for years. Windows notebooks with dedicated AI processors are becoming AI PCs. The gaming console in the living room has GPU power that used to look like a workstation. Still, we use this local compute mostly in short peaks: a game, an export, a video call, a local effect, a search. After that the device goes back to idle.

This is where the thought experiment begins. Not with: "Can I rent out my iPhone as a data center tomorrow?" That would be nonsense. Rather: if this much silicon is already paid for, networked, and on power every night, how large would the theoretical capacity be if only small, safe, suitable time windows could be used?

The world's unused compute capacity cannot be measured exactly. Devices differ widely, many are offline, and many cannot participate at all for battery, heat, security or platform reasons. A rough calculation still gives us a sense of the order of magnitude.

For that we take a few deliberately simple building blocks. The important point: I am not calculating as if every device were fully available all the time. I am calculating with time windows, participation rates and cautious discounts. It is a thought experiment, but one with numbers.

Tesla: silicon on wheels

Tesla reported its eight-millionth produced vehicle in June 2025. Not every vehicle is active, not every one has the same Autopilot hardware, and not every owner would release their car to a compute network. So I calculate conservatively:

Of 8 million produced vehicles, perhaps 80 percent are realistically active and technically relevant. That makes 6.4 million vehicles.
For Hardware 3, the FSD Computer used since 2019, a figure on the order of 144 TOPS for the system is commonly quoted.
Hardware 4 is in newer vehicles and is more modern, but Tesla does not publish a clean, simple TOPS value for it the way it did with the older Autonomy Day numbers. For this calculation I still use 144 TOPS as a conservative base value.
A car is often parked 23 hours a day, but the interesting window is charging. If it is on power for 6.5 hours at night, that is about 27 percent availability averaged over 24 hours.

If only 25 percent of these active Tesla owners sign up, that is 1.6 million vehicles and a daily equivalent of about 62 exa-operations per second (1.6 million × 144 TOPS × 6.5/24). At 50 percent participation it would be about 125 exa-operations per second. If, theoretically, all active vehicles took part, the number would reach about 250 exa-operations per second. Instantaneous power during the night window would be higher; the daily-equivalent number is simply the fairer comparison with a data center that runs 24 hours a day.
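The arithmetic behind these three figures can be reproduced in a few lines. This is only a sketch of the estimate above; the function name `daily_equivalent_exaops` is my own, and all inputs are the article's assumptions (8 million vehicles, 80 percent active, 144 TOPS, 6.5 hours on power).

```python
def daily_equivalent_exaops(devices: float, tops_per_device: float,
                            hours_per_day: float) -> float:
    """Peak throughput scaled by the share of the day it is available,
    expressed in exa-operations per second (1 TOPS = 1e12, 1 exa = 1e18)."""
    peak_ops_per_second = devices * tops_per_device * 1e12
    return peak_ops_per_second * (hours_per_day / 24) / 1e18


ACTIVE_VEHICLES = 8_000_000 * 0.80  # 6.4 million, the article's assumption

for share in (0.25, 0.50, 1.00):
    exa = daily_equivalent_exaops(ACTIVE_VEHICLES * share, 144, 6.5)
    print(f"{share:.0%} participation: {exa:.1f} exa-ops/s")
```

The unrounded results are 62.4, 124.8 and 249.6, which the text rounds to 62, 125 and 250.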

iPhones: the big surprise

With iPhones the calculation is both easier and harder. Easier because Apple ships enormous quantities every year. Harder because Apple provides no clean public table of which iPhone generations are still active worldwide. So I take the published shipments of recent years and apply a plausible share that is still active.

Year	iPhones shipped	Rough chip mix	Assumed active share	Average Neural Engine performance
2021	235.7 million	A14/A15	55 %	12 TOPS
2022	226.4 million	A15/A16	65 %	16 TOPS
2023	234.6 million	A16/A17 Pro	75 %	22 TOPS
2024	233.1 million	A16/A17/A18	85 %	30 TOPS
2025	247.8 million	A18/A19	95 %	32 TOPS

This mixed calculation yields about 885 million iPhones that are probably still active, from just five shipment years. That is not the entire active iPhone base but a deliberately limited slice. The older A14/A15 generations were in the low double-digit TOPS range, the A16 is at about 17 TOPS, and the A17 Pro at about 35 TOPS. A per-year average is therefore more reasonable than assuming every device has the same chip.

Now the same game again: 6.5 hours on power at night, not the whole day. If 25 percent of these devices take part, the daily equivalent is about 1,437 exa-operations per second. At 50 percent participation it would be about 2,875 exa-operations per second. If, theoretically, all devices took part, the number would be about 5,750 exa-operations per second.
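For anyone who wants to check the table, here is a minimal sketch that weights each shipment year by its assumed active share and average TOPS. The names `COHORTS`, `active_millions` and `daily_equivalent_exaops` are mine, and the inputs are the article's assumptions, not measurements. Recomputed from the unrounded table, the results land about 0.2 percent above the rounded figures in the text (roughly 886 million devices and 1,439 rather than 1,437 exa-operations per second at 25 percent).

```python
# (year, shipped in millions, assumed active share, assumed average TOPS)
COHORTS = [
    (2021, 235.7, 0.55, 12),
    (2022, 226.4, 0.65, 16),
    (2023, 234.6, 0.75, 22),
    (2024, 233.1, 0.85, 30),
    (2025, 247.8, 0.95, 32),
]


def active_millions(cohorts) -> float:
    """Devices assumed to be still active, in millions."""
    return sum(shipped * share for _, shipped, share, _ in cohorts)


def daily_equivalent_exaops(cohorts, participation: float,
                            hours_per_day: float = 6.5) -> float:
    """24-hour average throughput in exa-operations per second."""
    peak_ops = sum(shipped * 1e6 * share * tops * 1e12
                   for _, shipped, share, tops in cohorts)
    return peak_ops * participation * (hours_per_day / 24) / 1e18


print(f"active devices: {active_millions(COHORTS):.0f} million")
for participation in (0.25, 0.50, 1.00):
    exa = daily_equivalent_exaops(COHORTS, participation)
    print(f"{participation:.0%} participation: {exa:,.0f} exa-ops/s")
```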

That sounds crazy. But that is exactly the point. Not because an iPhone is a server, but because the sheer mass of devices is so large that even cautious ratios suddenly reach an order of magnitude we normally associate only with data centers.

Comparison

As further reference points I take:

50 million desktop GPUs, workstations or small servers that can deliver 20 TFLOPS FP32 on average. If only 20 percent of them are practically usable, about 200 exa-operations per second remain in suitable time windows.
xAI Colossus as the comparison from the data center world. With 200,000 Hopper GPUs at about 3,958 INT8 TOPS each, the H100 order of magnitude, that gives about 792 exa-operations per second of theoretical peak AI performance. This is the sparsity peak; dense and continuously usable performance is lower.
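Both reference points follow from the same multiplication. The sketch below uses a hypothetical helper, `exaops`, and deliberately treats FP32 TFLOPS and INT8 TOPS as the same unit, exactly the simplification the article makes for the sake of an order-of-magnitude comparison.

```python
def exaops(units: float, tera_ops_per_unit: float,
           usable_share: float = 1.0) -> float:
    """Theoretical throughput in exa-operations per second.
    Mixes FP32 TFLOPS and INT8 TOPS on purpose; this is not a benchmark."""
    return units * tera_ops_per_unit * 1e12 * usable_share / 1e18


reference_points = {
    "PCs / workstations": exaops(50e6, 20, usable_share=0.20),
    "xAI Colossus (sparsity peak)": exaops(200_000, 3958),
}

for name, value in sorted(reference_points.items(), key=lambda kv: kv[1]):
    print(f"{name}: {value:,.1f} exa-ops/s")
```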

Comparison of theoretical silicon

Rough exa-operations per second, averaged over 24 hours. Tesla and iPhones are calculated with a 6.5-hour night window; the figures show theoretical participation rates, not capacity available today.

Source	Exa-operations per second
Tesla, 50 % theoretical	125
PCs / workstations	200
xAI Colossus	792
iPhones, 25 % theoretical	1,437
iPhones, 50 % theoretical	2,875

Important: this is not a benchmark. FP32 FLOPS, INT8 TOPS and Neural Engine TOPS are not 1:1 substitutes for one another. Memory, interconnect, software, verification, energy efficiency, platform rights and real utilization decide whether peak performance turns into usable work.

This is not an exact global capacity. It is a thinking model. And this is where the brakes need to go on: TOPS cannot simply be poured into a shared pool like liters of water. An iPhone's Neural Engine TOPS, a GPU's INT8 TOPS and a workstation's FP32 performance are different things. Many meaningful jobs need not just operations but RAM, VRAM, memory bandwidth, a stable runtime, software access, and an operating system that actually allows such jobs.

Even so, the calculation shows that the idea is not ridiculous. Even a conservative combination of PCs, vehicles and smartphones lands, as theoretical silicon, in a range where a comparison with one of the world's most visible AI data centers no longer looks absurdly small.

The iPhone number is especially interesting because it looks at only five shipment years, not the whole active installed base. At the same time it is the best example of why peak performance is not enough: an iPhone is not a server. It has heat limits, battery logic, operating-system rules, privacy models, and an owner who expects a working device in the morning. Yet there is compute sitting there that would have sounded like science fiction a few years ago.

And these peak values are just that: peak values. A smartphone, a fanless notebook or a car control unit cannot simply deliver that performance for six hours like a data-center GPU. Thermals, throttling and protection logic reduce sustained performance massively. Anyone who wants to build a real network out of this has to calculate with sustained performance, not with the prettiest number on the datasheet.

It can also be thought of in terms of electricity. If 50 million devices contribute an average of 150 watts for four hours a day, that comes to about 11 TWh per year. That is only a small fraction of today's global data center consumption. But it could be enough to carry a lot of background jobs, embeddings, scientific workloads, rendering, verification tasks or decentralized storage processes.
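The electricity estimate is a one-line conversion from watts to terawatt-hours. As a rough check under the same assumptions (the helper name `annual_twh` is mine; 415 TWh is the IEA figure for 2024 quoted earlier):

```python
def annual_twh(devices: float, watts: float, hours_per_day: float,
               days: int = 365) -> float:
    """Yearly energy in TWh (1 TWh = 1e12 Wh)."""
    return devices * watts * hours_per_day * days / 1e12


grid_twh = annual_twh(50e6, 150, 4)
share_of_data_centers = grid_twh / 415  # IEA estimate for 2024

print(f"{grid_twh:.2f} TWh per year, "
      f"{share_of_data_centers:.1%} of 2024 data center consumption")
```

That works out to 10.95 TWh, or a little under 3 percent of the 2024 data center total.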

The unpleasant objection is this: it does not automatically become more efficient. Data centers have better cooling, better utilization, cheaper power, newer hardware and professional batching. A home device can be worse per unit of useful compute work, especially if overhead is high or a smartphone battery ages faster for the sake of a few cents in credits. A decentralized compute grid will therefore not be good merely because it is distributed. It has to be net useful for suitable workloads: technically, energetically and economically.

With the new AI PCs it gets even more interesting. Canalys expected about 100 million AI PC shipments for 2025. Many of these devices bring NPUs with 40 TOPS or more. TOPS are not the same thing as GPU FLOPS, and an NPU does not replace a data center. But even if we look at this performance very cautiously, a new class of local AI hardware is emerging, one that exists not just on paper but is arriving in offices and homes.

The point is not: "Tomorrow we will replace all data centers with gaming PCs, Teslas and iPhones." The point is: we are building massive new central capacity while, at the same time, a huge amount of distributed, already paid-for capacity expires unused.

Compute goes to waste

Electricity I can store. Not perfectly, not without losses, but in principle. If my solar system produces more at noon than I need, the energy goes into a battery or into the grid. In the evening I can use it again, or my neighbor does. Smart grids, battery storage and peer-to-peer energy models are making this way of thinking increasingly concrete: sometimes I produce, sometimes I consume, and the boundary between customer and provider softens.

Compute works differently.

I cannot pull yesterday's unused GPU hour out of a drawer today. A processor that did nothing all night has not saved compute for later. That time is gone. Irretrievable. Compute is perishable.

That is exactly what makes unused devices interesting. We do not just have hardware. We have possibilities that are constantly expiring. The realistic short-term pool is mostly desktop GPUs, workstations, gaming consoles, small servers, NAS storage, and campus or provider resources. Smartphones and cars are long-term edge cases: technically fascinating, but considerably harder because of battery, heat, platform rules, security and manufacturer control.

It does not fail on mathematics alone, but also on incentives. The devices with the most interesting idle silicon belong to closed platforms: Apple builds its own infrastructure with Private Cloud Compute for larger Apple Intelligence requests, and Tesla builds its own training capacity with Cortex for FSD and Optimus. Why would these companies open their device fleets to a manufacturer-independent compute market when control over hardware, software and cloud is the real moat?

Still, the basic question remains: why do we treat distributed compute as irrelevant while building ever larger central facilities?

Can AI even compute in a decentralized way?

Honesty is needed here: for much of what is visible as AI today, decentralization is difficult.

A large language model is not simply a list of small tasks that can be thrown onto arbitrary strangers' devices. Models need RAM or VRAM. They need memory bandwidth. In some cases they need fast interconnects. During token generation the model runs again and again, and every additional network hop makes the answer slower. Splitting a frontier model across strangers' smartphones, old laptops and cars would be mostly nonsense for an interactive chat answer.

That does not mean decentralized AI is impossible. It only means the right tasks have to be chosen.

A good fit is work that does not have to finish in two seconds: embeddings of large archives, batch summaries, rendering, scientific simulations, synthetic data, tests, crawling, verification tasks, decentralized storage repair, small local models, preprocessing, and tasks whose results can be checked or computed multiple times.

In practice, tasks have to be separated much more cleanly:

Job class	Sensible to decentralize?	Why
Private local inference	Yes, but locally	Data stays on your own device or within your own trust space.
Batch inference and embeddings	Often yes	High throughput matters more than latency in seconds.
Verifiable subtasks	Yes, if checkable	Results can be computed multiple times, attested, or controlled by tests.
Storage and replication	Yes, with rules	Encryption, erasure coding, audits and repair mechanisms are known building blocks.
Frontier training and hard SLAs	Rather not	Too much coupling, too much VRAM, and very high interconnect, operations and availability requirements.
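The "verifiable subtasks" row is the easiest one to make concrete. One simple way to check results from untrusted nodes is redundant computation: send the same job to several workers and accept an answer only if enough of them agree. The sketch below is an illustration under my own assumptions (the function name `accept_result` and the quorum of 2 are hypothetical); it is not how any particular network implements validation, and it only works for deterministic jobs whose results can be compared exactly, for example by digest.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def accept_result(results: Sequence[Hashable],
                  quorum: int = 2) -> Optional[Hashable]:
    """Return the most common result if at least `quorum` workers
    reported it; otherwise return None so the job can be re-issued."""
    if not results:
        return None
    value, votes = Counter(results).most_common(1)[0]
    return value if votes >= quorum else None


# Three workers computed the same job; one returned a different digest.
print(accept_result(["9f2c", "9f2c", "0000"]))  # two matching digests win
print(accept_result(["9f2c", "0000"]))          # no agreement yet
```

The price of this scheme is obvious: every accepted job costs at least twice the compute, which is one reason it suits cheap, otherwise idle capacity better than scarce data center GPUs.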

Large models are not completely excluded, but they need a different architecture. Petals showed that collaborative inference and fine-tuning of large models on distributed resources is possible in principle. Prime Intellect goes a step further with INTELLECT-2 and shows how distributed reinforcement learning with untrusted workers can work if results are verified. This is not a world in which your iPhone silently trains GPT-7 overnight. But it is a hint that the problem is not fundamentally impossible.

A realistic start is therefore not: "We distribute one huge model across everything." A realistic start is: local models first, regional pools for suitable batch jobs, verifiable tasks, clear data zones, and central data centers only where they are really needed.

The old dream of distributed systems

The internet has a second story too. A story that resembles a swarm more than a cathedral.

Volunteer computing

SETI@home was always one of the most beautiful examples for me. Millions of people let their computers crunch radio astronomy data in the background. Not because they got a SaaS dashboard in return, but because the idea was big enough: together we search for signals in the noise of the universe. Since March 2020, SETI@home no longer distributes new work units and is in a kind of hibernation. But as proof that volunteer computing can work globally, it remains important.

BOINC, the platform behind and alongside it, explains soberly why this works: many independent, compute-intensive jobs where throughput matters more than low latency. That is the decisive difference. A distributed system does not have to deliver every interactive chat answer in two seconds. It can be strong where work is divisible, verifiable and not due immediately.

Storage without a fixed place

IPFS carries the same thought into storage. Files are addressed primarily by content rather than by location. The content has a fingerprint. Whoever has the content can deliver it. That is a different way of thinking from "this file lives on this server under this URL".
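The idea of addressing by fingerprint fits in a few lines. This is a toy sketch, not IPFS: real IPFS uses content identifiers (CIDs) built on multihash and splits files into blocks, whereas this hypothetical `ContentStore` simply uses the hex SHA-256 digest of the whole content as its key.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is derived from the data."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self._blocks[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._blocks[key]
        # Anyone can verify the content against its address,
        # so it does not matter which node delivered it.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
address = store.put(b"hello swarm")
assert store.get(address) == b"hello swarm"
assert store.put(b"hello swarm") == address  # same content, same address
```

Because the address depends only on the bytes, identical content stored by two different nodes gets the same address, and a tampered copy is detectable by anyone who fetches it.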

Money without central bookkeeping

Bitcoin, however you judge the speculation and the energy consumption, popularized a similar original idea: a system without central bookkeeping, where consensus does not depend on any single institution. Not every decentralized idea is automatically good. But Bitcoin showed that a protocol can become politically powerful once the central point of control is removed.

Storage as a network

There were interesting attempts in storage as well. Symform was a decentralized cloud storage provider where excess storage could be contributed to the network. In 2014 Quantum acquired the platform; at the time, 45,000 users and small companies in 170 countries were mentioned. Storj, Sia, Filecoin and other variants show the same thing: the idea is not new. It just never fully arrives in everyday life.

Today the idea lives on in new forms. Storj breaks files into client-side encrypted chunks and distributes them across many storage nodes. That is closer to infrastructure than to romance: ideally the user does not see a swarm but a storage service that works.

Compute as a marketplace

Golem and Akash want to make unused compute accessible as a marketplace. For me this is the direct bridge to this article: it is not only storage space that lies around distributed, but also processors, GPUs and small servers that often sit idle today.

AI in a distributed swarm

Andrej Karpathy also reappears in this environment: Prime Intellect names him as a prominent supporter, and with INTELLECT-2 Prime Intellect started a decentralized, distributed RL training run for a 32B-parameter model to which heterogeneous, permissionless compute resources can contribute.

It is not a perfect answer. But it shows that the dream has not disappeared. It just keeps searching for a form that survives real operations.

Learning from the virtual power plant

Interestingly, the same thinking looks far less exotic in the electricity sector.

Tesla describes its Virtual Power Plant as a network of distributed energy resources: homes with solar systems and Powerwalls are treated together like a power plant. When the grid needs support, the batteries can supply electricity. The owner provides the resource and receives money or other benefits in return. Individual Powerwalls are small. Together they can matter to the grid.

It is this analogy that fascinates me for compute. A GPU in a home office, a NAS, an iPhone or a car is not a data center. But many devices together can form a new layer: not for everything, not all the time, not without rules, but for certain tasks.

The analogy has limits. Electricity is far more fungible than compute work. A kilowatt-hour does not care whether it has to run a model with 80 GB of VRAM, an embedding pipeline or an encrypted storage repair. Compute is workload-dependent. That is why job classes, scheduling and hard exclusions are needed.

At Tesla the same idea shows up in two places. Powerwalls can become part of a virtual power plant. Cars are meant to eventually become part of an autonomous Robotaxi fleet, earning money when the owner is not using them. How, and how quickly, that really scales is a separate question. But the basic idea is important: a private device does not only consume; in free time windows it can work as infrastructure.

Compute can be thought of in a similar way. Not as selling electricity to a neighbor, but as selling verifiable compute time, storage space or model work to a regional cell. In the end the user pays for electricity, heat, hardware wear and risk. So they should be compensated. Without this point the idea remains just a pretty technical experiment.

Why the swarm so rarely wins

If decentralization sounds so beautiful, why does it not simply win?

Because centralization is often the better product package.

A data center is controllable. A swarm consists of strangers' devices, different operating systems, changing availability, poor predictability, and owners who switch devices off, sell them, update them or disconnect them from the network. For a product manager that is not romance, it is a headache.

Economics comes into it as well. Many decentralized projects tried to solve incentives through tokens. That is understandable, because a network without a central company still needs compensation. But as soon as the costs of storage or compute time are tied to a volatile currency, it becomes unattractive for normal companies. I do not want my terabyte backup to become expensive overnight because some coin got pumped on Twitter. Nor do I want my GPU-hour budget to depend on a market that feels more like a casino than infrastructure.

And the price to beat is not the cloud's most expensive on-demand GPU. The real comparison is spot and preemptible offers, that is, excess data center capacity that providers sell at a significant discount. A decentralized compute network cannot just be philosophically nicer. It has to hold its own against very cheap, well-integrated, but interruptible cloud capacity.

The second brake is convenience. S3 did not win because it was philosophically beautiful. It won because it was simple enough, documented enough, and integrated everywhere. If decentralized storage or compute networks want to be relevant, they have to feel almost boring to developers and admins: enter an API key, create a bucket, monitoring, invoice, SLA, restore test, done.

Then comes security. In a corporate network, a stranger's compute job must not suddenly land inbound on workstations. Any reasonable firewall would block it, and it would look suspicious to threat-intelligence systems. In practice such a system should work from the inside out: the node contacts the cell, fetches checked jobs, runs them in a sandbox, and sees only the data it is allowed to see. Otherwise a legitimate compute network will quickly look, at the network level, like a very polite botnet.

Trust is the next hard point. Decentralized systems have to prove that work was done correctly without every node being allowed to see everything. In storage there are known building blocks: encryption, erasure coding, audits, repair mechanisms. In AI and compute it gets harder. How do I check that a stranger's device executed the model correctly? How do I prevent data leakage? How do I protect the participant's device from foreign code?

Hardware wear is more than electricity. SSDs and NVMe storage have write limits. Anyone who constantly writes model weights, temporary data, embedding batches or swap files to consumer devices uses up real lifetime. On top of that comes the bandwidth problem: if downloading a large model or dataset takes longer than the actual computation and creates more network overhead, the calculation flips. Data is heavier than the simple smart-grid metaphor suggests.
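The bandwidth point can be turned into a simple break-even check. The sketch below is my own illustration with hypothetical names (`transfer_seconds`, `worth_shipping`); it uses decimal gigabytes (1 GB = 8,000 megabits) and ignores upload, protocol overhead and caching, all of which would make the picture worse or better in practice.

```python
def transfer_seconds(size_gb: float, link_mbit_s: float) -> float:
    """Time to download `size_gb` (decimal GB) over a link of
    `link_mbit_s` megabits per second."""
    return size_gb * 8000 / link_mbit_s


def worth_shipping(size_gb: float, link_mbit_s: float,
                   compute_seconds: float) -> bool:
    """True if the node would spend more time computing than downloading."""
    return compute_seconds > transfer_seconds(size_gb, link_mbit_s)


# A 16 GB model over a 100 Mbit/s home connection takes 1,280 seconds.
print(transfer_seconds(16, 100))
print(worth_shipping(16, 100, compute_seconds=600))   # download dominates
print(worth_shipping(16, 100, compute_seconds=3600))  # compute dominates
```

The practical consequence is that weights and datasets should stay on a node across many jobs; a network that re-downloads them for every task loses before the first operation runs.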

This is where INTELLECT-2 gets interesting. In its paper, Prime Intellect describes TOPLOC as a building block that verifies the rollouts of untrusted inference workers. It does not suddenly solve all compute problems. It is not magical privacy for arbitrary company data on strangers' hardware. But it shows a real mechanism for one specific class of distributed AI work: jobs are built so that results are checkable, rather than blindly trusting every worker.

For confidential data that alone is not enough. Other building blocks are needed there: confidential computing, trusted execution environments, remote attestation, clean sandboxes, clear data classification, and, when in doubt, the hard decision not to run certain jobs on strangers' hardware at all. Along with that come boring but decisive questions: tax, liability, data protection, data residency, and internet providers' terms. Infrastructure rarely fails on mathematics alone. Often it fails on operations.

Compute-Smart-Grid

I do not imagine a naive "everything is P2P and then everything is good". Infrastructure does not work like that. What could work is a Compute-Smart-Grid with clear layers.

The first layer is local first. Anything personal, confidential or latency-critical should ideally run on your own device or within your own trust space: small models, local search, private summaries, simple classification, preprocessing, encryption, embeddings of personal data. Not every email, every note, every search needs to travel to a hyperscaler.

The second layer would be regional and federated. A city, a neighborhood, a campus, a company, a cooperative or a provider could operate a cell. In that cell, devices provide resources voluntarily but under clear conditions: only on power, only when idle, only within thermal limits, only up to a defined maximum performance, only for specific job classes.
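These conditions are easy to express as a local policy that the node itself enforces before it accepts anything. The following is a minimal sketch under my own assumptions: the names `NodePolicy`, `NodeState` and `may_accept`, the 75 °C limit and the job class labels are all hypothetical and only illustrate the shape of such a check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePolicy:
    max_watts: float = 80.0          # defined maximum performance
    max_temp_c: float = 75.0         # thermal limit (assumed value)
    allowed_jobs: frozenset = frozenset({"embeddings", "verification"})


@dataclass(frozen=True)
class NodeState:
    on_mains: bool                   # only on power
    idle: bool                       # only when idle
    temp_c: float


def may_accept(policy: NodePolicy, state: NodeState,
               job_class: str, job_watts: float) -> bool:
    """All owner-defined conditions must hold; any single 'no' wins."""
    return (state.on_mains
            and state.idle
            and state.temp_c <= policy.max_temp_c
            and job_watts <= policy.max_watts
            and job_class in policy.allowed_jobs)


policy = NodePolicy()
print(may_accept(policy, NodeState(True, True, 60.0), "embeddings", 70.0))
print(may_accept(policy, NodeState(False, True, 60.0), "embeddings", 70.0))
```

The design choice that matters is where the check runs: on the owner's device, with the owner's limits, so that the cell can ask but never override.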

The starting point would not be smartphones and cars but boring devices: desktop GPUs, workstations, gaming consoles, small servers, NAS systems, and the free capacity of local providers. Smartphones could later take on small verification jobs. Cars are conceivable even later, within very narrow, manufacturer-controlled limits. As with the electricity grid, you have to start with the resources that are reliable, measurable and controllable.

The third layer remains central. Frontier training, hard real time, extremely large models, regulatory-sensitive special cases and workloads with high coupling stay in professional data centers. Decentralization does not have to replace everything. It only has to prevent every piece of everyday work from automatically passing through the same five centers of power.

If I had to test this, I would start small. Not with millions of iPhones, but perhaps with a regional cell of 500 to 2,000 volunteer desktop GPUs, workstations, NAS systems and small servers. Only a few job types would be allowed: embeddings of non-sensitive data, scientific batch jobs, encrypted storage pieces, and verification tasks. Success would be measured not by a beautiful exa number but by three boring metrics: completed jobs per $1 of electricity cost, error and retry rate, and payout after hardware wear.
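The three metrics are simple enough to compute from a node's own log. The sketch below is one possible reading of them, with hypothetical names and made-up example numbers; in particular, treating "payout after hardware wear" as payout minus an estimated wear cost is my interpretation, and how to estimate that wear cost is left open.

```python
def pilot_metrics(jobs_ok: int, jobs_failed: int, retries: int,
                  kwh_used: float, price_per_kwh: float,
                  payout: float, wear_cost: float) -> dict:
    """Three 'boring' pilot metrics for one node or one cell."""
    attempts = jobs_ok + jobs_failed
    electricity_cost = kwh_used * price_per_kwh
    if attempts == 0 or electricity_cost <= 0:
        raise ValueError("need at least one job and a positive electricity cost")
    return {
        "jobs_per_dollar": jobs_ok / electricity_cost,
        "error_rate": jobs_failed / attempts,
        "retry_rate": retries / attempts,
        "payout_after_wear": payout - wear_cost,
    }


# Illustrative numbers only, not measurements.
m = pilot_metrics(jobs_ok=900, jobs_failed=100, retries=50,
                  kwh_used=10, price_per_kwh=0.30,
                  payout=5.00, wear_cost=1.50)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```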

The hardest part would be compensation. The user pays for electricity, heat and hardware wear. They need something back. Perhaps a token or credit really is needed. But as an infrastructure credit, not an object of speculation.

Such a compute credit should represent something real: a GPU minute of a specific class, a GB-month of storage, a verified inference, a batch-embedding unit, or a kWh-equivalent compute unit. Whoever provides resources earns credits. Whoever needs AI performance later consumes them. Whoever does not want to consume can take a fiat payout, just as nobody in a virtual power plant wants to be paid in "Powerwall coins" but in real money or a clear credit.

That does not magically solve the price question. Stability needs an anchor: fiat billing, energy-price corridors, regional clearing houses, cooperative tariffs, or regulated operators. Without governance, "stable credits" quickly turn into a freely floating token. And then we are back at the old problem: infrastructure that feels like a casino.

Even more important is the question of operating rights. We do not need to train every large foundation model ourselves. Perhaps models, open weights or model families are bought or licensed and then operated in a decentralized, federated, regionally controlled way. Real sovereignty would then lie not only in training but in operations: where do the models run? Where does the data sit? Who can audit? Is there a back channel to the provider? If politics, prices or terms change, can I keep operating the model locally?

So that this does not remain just a nice purchasing contract, such licenses should include real operating rights: local deployments, long-term update and security commitments, understandable model cards, auditability, clear exit rights, and no obligation to push sensitive data back into a central manufacturer cloud. That would not be a pure decentralized utopia. But it would be a realistic path between naive self-sufficiency and total platform lock-in.

The night the devices compute

Imagine it is 22:43. Your GPU desktop is idle, the NAS is online, the phone is charging. In the settings you have defined: a maximum of 80 watts, only when idle, only verified workloads from the region, and only when compensation covers the electricity cost plus a hardware allowance.

A local agent reports free capacity. Not with your name, not with your private data, but as an attested node with certain capabilities. The cell distributes small jobs: simulations, embeddings, encrypted storage pieces, verification tasks.

In the morning there is no rocket, no Wall Street story, no hype token. Just one sober line:

Last night: 2.4 GPU credits earned, 18 GB-months of storage confirmed, $0.31 electricity cost estimated.

Later you use those credits for a local model on your own documents. The sensitive data stays with you. You are not just a customer. You are a participant.

It sounds romantic. Yes. But sometimes that is exactly the reason to take a difficult engineering problem seriously.

Swarm and mountain

I do not think central data centers will disappear. They are too efficient, too important, and for some tasks simply necessary. The mountain remains. The only question is whether we build ground around it again.

Ground made of local devices, regional cells, open protocols, stable credits and clear security models. Ground where compute is not only sold from the top down but flows between participants. Ground where certain AI work runs where it belongs: private work locally, regional work regionally, global edge cases in the data center.

Maybe that is naïve. Maybe not. Virtual power plants were once a strange idea too: thousands of small batteries acting as one large network. Decentralized money seemed absurd for a long time. Cars driving as autonomous taxis sounded like science fiction. Not all of it will arrive as promised. But the direction is clear: resources that used to stand around passively are increasingly being thought of as part of a larger system.

Right now unused machines are standing everywhere. In homes, offices, garages, server rooms and pockets. Not every one is equally suitable. Not every one should ever execute a stranger's work. But many already exist, are paid for and networked. And every unused second disappears.

Maybe we should start listening to them.

Best wishes,
Joe

Sources